Over the past several decades, the field of digital signal processing (DSP) has grown from a
theoretical infancy to a powerful practical tool and matured into an economical yet successful
technology. A major reason for its success in practice is the development of low-cost
digital hardware and, in particular, special-purpose single-chip DSP microprocessors and
microcomputers. Therefore, an effective education in DSP must include not only the theory but
also a practical element in a laboratory environment. This book is intended to provide such an
element and bridge the gap between theory and practice. The laboratory experiments and projects
in this book are based on the ADSP-2101, a DSP microcomputer manufactured by Analog
Devices, Inc. In addition, the book is designed as a manual containing brief yet sufficient
information on the total system development aspects of the ADSP-2101.

We assume that the student (or user) is familiar with the fundamentals of discrete linear systems 
and concurrently taking a course in DSP. These fundamentals are not covered in this book since 
several excellent books on DSP are available. We also assume that the student (or user) is 
knowledgeable in basic computer programming although no prior exposure to the actual 
ADSP-2101 assembly language is expected. Since the development of the experiments and
projects is performed on a personal computer (PC), some knowledge of the PC and its operating 
system (DOS) is essential. 

The following is a list of chapters and a brief description of their contents: 

Chapter 1, Introduction to the ADSP-2100/2101 Family: This chapter provides a ready- 
reference on the building blocks of the microcomputer. Included is a brief description of the 
core internal architecture of the ADSP-2100 family containing computational units, data address 
generators, and program sequencer; a summary of the unique additional features of the 
ADSP-2101 including timer, serial ports, and memories; an illustration of a basic system
configuration with system and memory interfacing. 

Chapter 2, ADSP-2101 Instruction Set Overview: This chapter provides sufficient information 
to understand the nature of programming the ADSP-2101 and the capabilities of its instruction 
set. Included is a comprehensive summary with examples of computational instructions, data
move instructions, program control instructions, multifunction instructions, and other miscel- 
laneous instructions; a description of data structures used in programming. 

Chapter 3, Overview of Development Tools: Included is a description of development tools for 
translating and debugging DSP source code: target system builder, the assembler, the linker, 
the simulator, the in-circuit emulator EZ-ICE, and the evaluation board EZ-LAB; a discussion
on host computer requirements. 

Chapter 4, Getting Started with the ADSP-2101: This chapter provides hands-on training on 
important aspects of system development. It includes learning of target system description and 
specification; management of simulator window environment and navigation; a description of 
simulator commands; instruction set work-out; a complete description of EZ-ICE firmware and 
window commands. 

Chapter 5, Laboratory Experiments Using the ADSP-2101: This chapter is devoted to
experiments which incorporate all basic operations done in DSP. It deals with simple programs 
and experiments in A/D and D/A conversion, signal delay and echo generation, convolution 
operation and recursive filtering, and waveform generation. 

Chapter 6, FIR Filter Implementations: This chapter contains projects on FIR filters. Included
is an overview of finite impulse response filter structures and design techniques; a single- 
precision direct-form implementation; a double-precision implementation; a lattice filter 
implementation; a single sideband modulator. 

Chapter 7, IIR Filter Implementations: This chapter contains projects on IIR filters. Included 
is a summary of infinite impulse response filter structures and design methods; a direct form 
implementation; a cascade form implementation; an all-pole lattice filter implementation. 

Chapter 8, Fast Fourier Transform Implementations: Included is a review of the discrete Fourier 
transform (DFT); a complete description and implementation of decimation-in-time and
decimation-in-frequency fast Fourier transform algorithms; the inverse DFT and its imple- 
mentation. 

Chapter 9, Applications in Communications: This chapter focuses on several experiments 
dealing with waveform representation and coding, and with digital communications. Included 
is a description of pulse code modulation (PCM), differential PCM (DPCM) and adaptive DPCM
(ADPCM), delta modulation (DM) and adaptive DM (ADM), linear predictive coding (LPC); 
generation and detection of dual-tone multifrequency (DTMF) signals; a description of signal 
detection applications in binary communications and spread spectrum communications. 

Chapter 10, Adaptive Filters and their Applications: This chapter provides a formulation of 
experiments in the applications of adaptive filtering. Included is an introduction to the theory 
and implementation of adaptive FIR filters with applications to system identification, inter- 
ference suppression, narrowband frequency enhancement, adaptive equalization and echo 
cancellation. 



The book is an outgrowth of our teaching of an undergraduate laboratory course in DSP. This
laboratory course, containing eleven 3 1/2-hour sessions, is taken by students concurrently with
the DSP course at Northeastern University. The material described in the first four chapters is
typically covered in 2 to 3 sessions in a tutorial setting. The experiments described in Chapter
5 are then done in about 6 sessions. The remaining sessions are devoted to one project, which
a student chooses based on the material given in the remaining five chapters.

The book can also be used in a graduate DSP course with projects on FIR and IIR filters, fast
Fourier transforms, and adaptive filtering. Similarly, a course on communication systems can
benefit from the projects described in Chapter 9. The book can also be used as a self-study
guide by anyone interested in practical DSP implementations, or it can be used in an industrial
setup for evaluation.

The book contains several program listings. These listings are either example programs,
subroutines or computer files generated by the software development tools. These listings are given
in their own "Listing" blocks which are serialized and referred to in the text. The example
programs and subroutines have header information containing the name of the respective
computer file. These files are provided on a diskette which is available from Analog Devices,
Inc. (In order to receive this diskette call DSP Division Applications Engineering at (617)
461-3672.) In addition, several exercise solutions are also available on the diskette. The user
should work through each exercise and then compare the solution with the one given on the diskette.
Complete information about the files contained on the diskette is available in the README.TXT
file.

The software development tools described in this book are available for IBM-AT compatible
personal computers. The Cross-Software used is version 2 or later and works on any 286- or
386-based system. The installation of the Cross-Software requires PC-DOS 3.0 or later, 640
KB memory, and the directive "FILES=25" in the CONFIG.SYS file. Additionally, a hard disk
and a color display system are highly recommended.

We would like to thank Analog Devices, Inc. for its generous support of our DSP Laboratory 
and encouragement for converting our lab manual into this book. The staff of the DSP Division 
and its Applications Group at Analog Devices, Inc. provided some material and programs 
reported in this book. In particular we would like to thank Bob Fine and Steve Cox who patiently 
answered our many questions, gave advice and explanations, and provided feedback on the
earlier versions of the manuscript. 

We are also indebted to our graduate students Eric Seto, Anil Shrestha, and Yiduk Kwon who 
helped us design and execute the experiments and projects on the development system. Finally, 
we wish to thank Mr. Hans Rempel of Analog Devices, Inc. who helped on the final preparation 
of the manuscript. 

Vinay K. Ingle 
John G. Proakis 

Boston, Massachusetts 

Chapter 1

INTRODUCTION TO THE
ADSP-2100/2101 FAMILY

1.1 INTRODUCTION

The ADSP-21xx is a family of programmable single-chip processors optimized for digital signal
processing (DSP) and other high-speed numeric processing applications. These processors use 
a modified Harvard architecture with separate buses for data and instructions. They also 
incorporate computational units, data address generators and a program sequencer in one device 
along with some necessary electronics. 

The ADSP-2100 is a single-chip microprocessor with both program and data buses
extending off chip. It requires external data and program memories, and contains three
independent, full-function computational units: an arithmetic/logic unit, a multiplier/accumulator
and a barrel shifter. These computational units process 16-bit data directly and provide for
multiprecision computation. Two data address generators and a program sequencer provide
addresses, and together they allow computational operations to execute with maximum efficiency.
Figure 1-1 shows the ADSP-2100 internal architecture.

The ADSP-2101, on the other hand, is a single-chip microcomputer based on the ADSP-2100
and contains additional on-chip program and data memory, two serial ports, a timer and
extensive interrupt capabilities. It has 1K words of (16-bit) data memory RAM and 2K words
of (24-bit) program memory RAM on the chip. The processor can fetch an operand from on-chip
data memory, an operand from on-chip program memory and the next instruction from the
on-chip program memory in one single cycle. This internal bus structure is extended off-chip
via a single external memory address bus and data bus. Figure 1-2 is an overall block diagram
of the ADSP-2101.

The ADSP-2101 microcomputer is fabricated in a high-speed 1.0 micron double-layer
metal CMOS process and operates at an internal clock rate of 50 MHz. With an external clock or
crystal at 12.5 MHz, every instruction executes in a single cycle of 80 ns. Fabrication in CMOS
results in low power requirements. The ADSP-2101 dissipates less than 1 W under all conditions
and no more than 80 mW under standby conditions. It is available in a 68-pin Pin Grid Array
(PGA) and a 68-lead Plastic Leaded Chip Carrier (PLCC).

In this book, we primarily deal with the ADSP-2101 since its architecture is a superset of
the ADSP-2100. When needed we will discuss the differences between the two devices.
However, the discussion will be based exclusively on the ADSP-2101. In the following section
we provide a brief description of the internal architecture of the ADSP-2101. For a detailed
description of all hardware elements, refer to ADSP-2100/2101 Architecture User's Manual
[1]. In Section 1.3 we discuss basic system configuration with the ADSP-2101. Finally in
Section 1.4, we summarize this chapter with some features of the ADSP-2101.

Figure 1-2: ADSP-2101 Internal Architecture

1.2 ARCHITECTURE OVERVIEW 

The ADSP-2101 evolved from its predecessor, the ADSP-2100. For compatibility with the
ADSP-2100, the additional features of the ADSP-2101 appear in the form of new mode controls,
new processor registers and a group of memory mapped control registers. Hence we first describe
the core internal architecture which is common to both processors. We then describe additional
hardware elements which are special to the ADSP-2101.

1.2.1 Core Internal Architecture

Both the ADSP-2100 and the ADSP-2101 share a basic set of hardware elements called the 
core architecture. These elements are: 

• Arithmetic-Logic Unit (ALU) 

• Multiplier-Accumulator (MAC) 

• Barrel Shifter 

• Two Data Address Generators (DAG) 

• Program Sequencer. 

Efficient data transfer is achieved with the use of five internal buses: 

• Program Memory Address (PMA) Bus 

• Program Memory Data (PMD) Bus 

• Data Memory Address (DMA) Bus 

• Data Memory Data (DMD) Bus 

• Result (R) Bus. 

Figure 1-3 illustrates the block diagram of this core internal architecture.

Figure 1-3: ADSP-2100 Family Core Internal Architecture



The ALU performs a standard set of arithmetic and logic operations in addition to division 
primitives. The MAC performs single-cycle multiply, multiply/add and multiply/subtract 
operations. The Shifter performs logical and arithmetic shifts, normalization, denormalization,
and derive exponent operations. The Shifter implements numeric format control including 
multiword floating point representations. The computational units are arranged side-by-side 
instead of serially so that the output of any unit may be the input of any unit on the next cycle. 
The internal result (R) bus directly connects the computational units to make this possible. 

All three sections contain input and output registers which are accessible from the internal
Data Memory Data (DMD) bus. Computational operations generally take their operands from
input registers and load the result into an output register. The registers act as a stopover point
for data between memory and the computational circuitry. This feature introduces one level of
pipelining on input, and one level on output. The R bus allows the result of a previous
computation to be used directly as the input to another computation. This avoids excessive pipeline
delays when a series of different operations are performed.

Two dedicated data address generators and a powerful program sequencer ensure efficient
use of these computational units. The Data Address Generators (DAGs) provide memory
addresses when memory data is transferred to or from the input/output registers. Each DAG
keeps track of up to four address pointers. When a pointer is used for indirect addressing, it is
post-modified by a value in a specified register. With two independent DAGs, the processor
can generate two addresses simultaneously for dual operand fetches.

A length value may be associated with each pointer to implement automatic modulo 
addressing for circular buffers. (The circular buffer feature is also used by the serial ports for 
automatic data transfers. Refer to Section 1.2.2 for information on Serial Ports.) DAG1 can 
supply addresses to data memory only. DAG2 can supply addresses to either the data memory 
or the program memory. Two independent address generators allow for simultaneous access of 
data stored in the program memory and data stored in the data memory. 
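
The post-modify and modulo addressing behavior described above can be illustrated with a
short Python model. This is only a sketch of the idea, not a description of the actual DAG
hardware: the class name, the attribute names (index, length, base) and the way the wrap
is computed are our own assumptions, and the buffer base address is assumed to be known.

```python
class DataAddressGenerator:
    """Toy model of one DAG pointer with post-modify and modulo wrap.

    All names here are illustrative assumptions, not ADSP-2101 definitions.
    """

    def __init__(self, base, length, index=None):
        self.base = base        # first address of the circular buffer (assumed known)
        self.length = length    # buffer length; 0 selects plain linear addressing
        self.index = base if index is None else index  # current address pointer

    def access(self, modify):
        """Return the current address, then post-modify the pointer by `modify`."""
        address = self.index
        nxt = self.index + modify
        if self.length > 0:
            # Wrap the updated pointer back into [base, base + length).
            nxt = self.base + (nxt - self.base) % self.length
        self.index = nxt & 0x3FFF  # addresses are 14 bits wide
        return address


if __name__ == "__main__":
    dag = DataAddressGenerator(base=0x100, length=4)
    print([hex(dag.access(1)) for _ in range(6)])
```

With a length of 4 and a modify value of 1, the pointer visits the four buffer locations
and then returns to the first one without any explicit address test in the program.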

The Program Sequencer supplies instruction addresses to the program memory. The
sequencer is driven by the Instruction Register which holds the currently executing instruction.
The instruction register introduces a single level of pipelining into the program flow. Instructions
are fetched, loaded into the instruction register, and decoded during one processor cycle; and
executed during the following cycle while the next instruction is prefetched. To minimize
overhead cycles, the sequencer supports conditional jumps, subroutine calls and returns in a
single cycle. With an internal loop counter and loop stack, the ADSP-2101 executes looped
code with zero overhead. No explicit jump instructions are required to loop.

These components are supported by five internal buses. The PMA and DMA buses are
used internally for the addresses associated with Program and Data Memory. The Program
Memory Data (PMD) and Data Memory Data (DMD) buses are used for the data associated
with the memory spaces. These two pairs of buses are multiplexed off chip to the external
address and data buses. The BMS, DMS and PMS (pin) signals select the different address
spaces. The R bus is an internal bus which serves to transfer intermediate results directly between
the various computational sections.

The Program Memory Address (PMA) bus is 14 bits wide, allowing direct access of up to
16K words of mixed instruction code and data. The Program Memory Data (PMD) bus is 24 bits
wide to accommodate the 24-bit instruction width.

The Data Memory Address (DMA) bus is 14 bits wide, allowing direct access of up to 16K
words of data. The Data Memory Data (DMD) bus is 16 bits wide. The DMD bus provides
a path for the contents of any register in the processor to be transferred to
any other register or to any external data memory location in a single cycle. The data memory
address comes from two sources: an absolute value specified in the instruction code (direct
addressing) or the output of a data address generator (indirect addressing). Only indirect
addressing is supported for data fetches from program memory.

The Program Memory Data (PMD) bus can also be used to transfer data to and from the
computational units through direct paths or via the PMD-DMD bus exchange unit. The
PMD-DMD bus exchange unit permits data to be passed from one bus to the other. It contains
hardware (PX register) to overcome the 8-bit width discrepancy between the two buses, if
necessary. The 8-bit PX register can be read or written like any other data register.
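
The width discrepancy handled by the bus exchange unit can be pictured with the following
Python sketch. It assumes that the upper 16 bits of a 24-bit program memory word travel on
the 16-bit DMD bus while the remaining lower 8 bits are held in PX; the text above does not
state which bits go where, so this split and both function names are assumptions made only
for illustration.

```python
def split_pmd_word(word24):
    """Split a 24-bit PMD word into a 16-bit DMD value and an 8-bit PX value.

    Assumption: the upper 16 bits go to the DMD side and the lower 8 bits to PX.
    """
    word24 &= 0xFFFFFF
    upper16 = word24 >> 8   # part that fits on the 16-bit DMD bus
    px = word24 & 0xFF      # leftover 8 bits, held in the PX register
    return upper16, px


def join_pmd_word(upper16, px):
    """Rebuild the 24-bit PMD word from a 16-bit DMD value and the PX bits."""
    return ((upper16 & 0xFFFF) << 8) | (px & 0xFF)


if __name__ == "__main__":
    dmd_value, px_value = split_pmd_word(0x123456)
    print(hex(dmd_value), hex(px_value))
    print(hex(join_pmd_word(dmd_value, px_value)))
```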

Arithmetic-Logic Unit (ALU) 

The Arithmetic/Logic Unit (ALU) provides a standard set of arithmetic and logical functions. 
The arithmetic functions are add, subtract, negate, increment, decrement and absolute value. 
These are supplemented by two division primitives with which multiple cycle division can be 
constructed. The logic functions are AND, OR, XOR (exclusive OR) and NOT. Figure 1-4
shows a block diagram of the ALU. 

The ALU is 16 bits wide with two 16-bit input ports, X and Y, and one output port, R. 
The ALU accepts a carry-in signal (CI) which is the carry bit from the processor arithmetic 
status register (ASTAT). The ALU generates six status signals: the zero (AZ) status, the negative 
(AN) status, the carry (AC) status, the overflow (AV) status, the X-input sign (AS) status, and 
the quotient (AQ) status. All arithmetic status signals are latched into the arithmetic status 
register (ASTAT) at the end of the cycle. 
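
As an illustration of how the status signals relate to a result, the following Python sketch
models a 16-bit add with carry-in and derives the AZ, AN, AC and AV flags using ordinary
twos-complement rules. It is a simplified model under our own assumptions, not the ALU's
actual logic; the function name is hypothetical, and AS and AQ are left out because the
text above does not say when they change.

```python
def alu_add(x, y, carry_in=0):
    """Model a 16-bit ALU add and return (result, flags).

    Flags follow standard twos-complement conventions (an assumption of this sketch).
    """
    x &= 0xFFFF
    y &= 0xFFFF
    total = x + y + (carry_in & 1)
    result = total & 0xFFFF
    flags = {
        "AZ": int(result == 0),   # result is zero
        "AN": result >> 15,       # result is negative (MSB set)
        "AC": total >> 16,        # carry out of bit 15
        # Overflow: both inputs have the same sign and the result's sign differs.
        "AV": int(((x ^ result) & (y ^ result) & 0x8000) != 0),
    }
    return result, flags


if __name__ == "__main__":
    print(alu_add(0x7FFF, 0x0001))
    print(alu_add(0xFFFF, 0x0001))
```

Adding 1 to 0x7FFF overflows into the sign bit, so AV and AN are set while AC stays clear;
adding 1 to 0xFFFF wraps to zero, so AZ and AC are set without an overflow.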

The X input port of the ALU can accept data from two sources: the AX register file or
the result (R) bus. The R bus connects the output registers of all the computational units,
permitting them to be used as input operands directly. The AX register file is dedicated to the X
input port and consists of two registers, AX0 and AX1. These AX registers are readable and
writable from the DMD bus. The instruction set also provides for reading these registers over
the PMD bus, but there is no direct connection; this operation uses the DMD-PMD bus exchange
unit. The AX register file outputs are dual-ported so that one register can provide input to the
ALU while either one simultaneously drives the DMD bus.

The Y input port of the ALU can also accept data from two sources: the AY register file
and the ALU feedback (AF) register. The AY register file is dedicated to the Y input port and
consists of two registers, AY0 and AY1. These registers are readable and writable from the
DMD bus and writable from the PMD bus. The instruction set also provides for reading these
registers over the PMD bus, but there is no direct connection; this operation uses the DMD-PMD
bus exchange unit. The AY register file outputs are also dual-ported: one AY register can provide
input to the ALU while either one simultaneously drives the DMD bus.

The output of the ALU is loaded into either the ALU feedback (AF) register or the ALU 
result (AR) register. The AF register is an ALU internal register which allows the ALU result 
to be used directly as the ALU Y input. The AR register can drive both the DMD bus and the 
R bus. It is also loadable directly from the DMD bus. The instruction set also provides for 
reading AR over the PMD bus, but there is no direct connection; this operation uses the 
DMD-PMD bus exchange unit. 

Figure 1-4: ALU Block Diagram

Any of the registers associated with the ALU can be both read and written in the same
cycle. Registers are read at the beginning of the cycle and written at the end of the cycle. A
register read, therefore, reads the value loaded at the end of a previous cycle. A new value
written to a register cannot be read out until a subsequent cycle. This allows an input register
to provide an operand to the ALU at the beginning of the cycle and be updated with the next
operand from memory at the end of the same cycle. It also allows a result register to be stored
in memory and updated with a new result in the same cycle. See the discussion of "Multifunction
Instructions" in Chapter 2, "ADSP-2101 Instruction Set Overview", for an illustration of
this same-cycle read and write.
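
The read-at-start, write-at-end behavior can be mimicked in software with a register object
that holds a pending value until the cycle ends. The following Python sketch is only a
conceptual model; the class and method names are hypothetical and do not correspond to
anything in the ADSP-2101 tools.

```python
class Register:
    """Register that is read at the start of a cycle and updated at its end."""

    def __init__(self, value=0):
        self.value = value      # value visible to reads during the current cycle
        self._pending = None    # value written during this cycle, not yet visible

    def read(self):
        return self.value

    def write(self, value):
        self._pending = value

    def end_of_cycle(self):
        """Commit the pending write so that it is visible in the next cycle."""
        if self._pending is not None:
            self.value = self._pending
            self._pending = None


if __name__ == "__main__":
    ax0 = Register(3)
    operand = ax0.read()    # the ALU takes its operand at the start of the cycle
    ax0.write(7)            # the next operand arrives from memory in the same cycle
    print(operand, ax0.read())
    ax0.end_of_cycle()
    print(ax0.read())
```

Within one cycle the read still returns the old value even after the write has been issued;
the new value becomes readable only after the cycle boundary.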

The ALU contains a duplicate bank of registers, shown in Figure 1-4 behind the primary
registers. There are actually two sets of AR, AF, AX, and AY register files. Only one bank is
accessible at a time. The additional bank of registers can be activated (such as during an interrupt
service routine) for extremely fast context switching. A new task, like an interrupt service
routine, can be executed without transferring current states to storage. The selection of the
primary or alternate bank of registers is controlled by a bit in the processor mode status register
(MSTAT). If this bit is a 0, the primary bank is selected; if it is a 1, the secondary bank is
selected.

Multiplier-Accumulator (MAC) 

The Multiplier/Accumulator provides high-speed multiplication, multiplication with cumula- 
tive addition, multiplication with cumulative subtraction, saturation and clear-to-zero functions. 
A feedback function allows part of the accumulator output to be directly used as one of the 
multiplicands on the next cycle. Figure 1-5 shows a block diagram of the multiplier/accumulator.

The multiplier has two 16-bit input ports, X and Y, and a 32-bit product output port, P. The
32-bit product is passed to a 40-bit adder/subtractor which adds the new product to, or subtracts
it from, the content of the multiplier result (MR) register, or passes the new product directly to
MR. The MR register is 40 bits wide. In this manual, we refer to the entire register as MR. The
register actually consists of three smaller registers: MR0 and MR1, which are 16 bits wide, and
MR2, which is 8 bits wide.

The adder/subtractor is greater than 32 bits to allow for intermediate overflow in a series 
of multiply/accumulate operations. The multiply overflow (MV) status bit is set when the 
accumulator has overflowed beyond the 32-bit boundary, that is, when there are significant 
(non-sign) bits in the top nine bits of the MR register (based on twos-complement arithmetic). 
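
The role of the 40-bit accumulator and the MV bit can be illustrated with the Python sketch
below. It treats the inputs as plain signed 16-bit integers and ignores any number-format
adjustment the hardware may apply to the product, so it is a simplified model under stated
assumptions rather than the MAC's exact behavior. The function names are our own.

```python
MASK40 = (1 << 40) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a twos-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(mr, x, y, subtract=False):
    """Multiply two signed 16-bit inputs and accumulate into a 40-bit MR.

    Returns (new_mr, mv). MV is 1 when the top nine bits of MR (bits 39..31)
    are not all equal, that is, when they hold significant non-sign bits.
    """
    product = to_signed(x, 16) * to_signed(y, 16)
    acc = to_signed(mr, 40)
    acc = acc - product if subtract else acc + product
    acc &= MASK40
    top9 = acc >> 31
    mv = int(top9 not in (0x000, 0x1FF))
    return acc, mv


def split_mr(mr):
    """Split the 40-bit MR value into (MR2, MR1, MR0)."""
    return (mr >> 32) & 0xFF, (mr >> 16) & 0xFFFF, mr & 0xFFFF


if __name__ == "__main__":
    mr, mv = 0, 0
    for _ in range(3):
        mr, mv = mac(mr, 0x7FFF, 0x7FFF)
        print(hex(mr), mv)
```

In this model the first two products still fit within 32 signed bits, so MV stays clear; the
third accumulation spills past the 32-bit boundary and sets MV, yet the sum is still held
exactly in the 40-bit register.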

The input/output registers of the MAC section are similar to those of the ALU. The X input port
can accept data from either the MX register file or from any register on the result (R) bus. The
R bus connects the output registers of all the computational units, permitting them to be used
as input operands directly. There are two registers in the MX register file, MX0 and MX1. These
registers can be read and written from the DMD bus. The MX register file outputs are dual-ported
so that one register can provide input to the multiplier while either one simultaneously drives
the DMD bus.

The Y input port can accept data from either the MY register file or the MF register. The
MY register file has two registers, MY0 and MY1; these registers can be read and written from
the DMD bus and written from the PMD bus. The ADSP-2101 instruction set also provides for
reading these registers over the PMD bus, but there is no direct connection; this operation uses
the DMD-PMD bus exchange unit. The MY register file outputs are also dual-ported so that
one register can provide input to the multiplier while either one simultaneously drives the DMD
bus.



The output of the adder/subtractor goes to either the MF register or the MR register. The
MF register is a feedback register which allows bits 16-31 of the result to be used directly as
the multiplier Y input on a subsequent cycle. The 40-bit adder/subtractor register (MR) is divided
into three sections: MR2, MR1, and MR0. Each of these registers can be loaded directly from
the DMD bus and output to either the DMD bus or the R bus.

Any of the registers associated with the MAC can be both read and written in the same
cycle. Registers are read at the beginning of the cycle and written at the end of the cycle. A
register read, therefore, reads the value loaded at the end of a previous cycle. A new value
written to a register cannot be read out until a subsequent cycle. This allows an input register
to provide an operand to the MAC at the beginning of the cycle and be updated with the next
operand from memory at the end of the same cycle. It also allows a result register to be stored
in memory and updated with a new result in the same cycle.

The MAC contains a duplicate bank of registers, shown in Figure 1-5 behind the primary
registers. There are actually two sets of MR, MF, MX, and MY register files. Only one bank is
accessible at a time. The additional bank of registers can be activated for extremely fast context
switching. A new task, such as an interrupt service routine, can be executed without transferring
current states to storage. The selection of the primary or alternate bank of registers is controlled
by a bit in the processor mode status register (MSTAT). If this bit is a 0, the primary bank is
selected; if it is a 1, the secondary bank is selected.

Barrel Shifter 

The shifter unit provides a complete set of shifting functions for 16-bit inputs, yielding a 32-bit 
output. These include arithmetic shift, logical shift and normalization. The Shifter also performs
derivation of exponent and derivation of common exponent for an entire block of numbers. 
These basic functions can be combined to efficiently implement any degree of numerical format 
control, including full floating-point representation. Figure 1-6 shows a block diagram of the 
shifter. 

The shifter section can be divided into the following components: the shifter array, the 
OR/PASS logic, the exponent detector, and the exponent compare logic. 

The shifter array is a 16x32 barrel shifter. It accepts a 16-bit input and can place it anywhere
in the 32-bit output field, from off-scale right to off-scale left, in a single cycle. This gives 49
possible placements within the 32-bit field. The placement of the 16 input bits is determined
by a control code (C) and a HI/LO reference signal.

The shifter array and its associated logic are surrounded by a set of registers. The shifter
input (SI) register provides input to the shifter array and the exponent detector. The SI register
is 16 bits wide and is readable and writable from the DMD bus. The shifter array and the exponent
detector can also take AR, SR or MR as inputs via the R bus. The shifter result (SR) register is 32
bits wide and is divided into two 16-bit sections, SR0 and SR1. The SR0 and SR1 registers can
be loaded from the DMD bus and output to either the DMD bus or the R bus. The SR register
is also fed back to the OR/PASS logic to allow double-precision shift operations.

The SE register ("shifter exponent") is 8 bits wide and holds the exponent during the 
normalize and denormalize operations. The SE register is loadable and readable from the lower 
8 bits of the DMD bus. It is a twos-complement, integer value. 

The SB register ("shifter block") is important in block floating-point operations where it
holds the block exponent value, that is, the value by which the block values must be shifted to
normalize the largest value. SB is 5 bits wide and holds the most recent block exponent value.
The SB register is loadable and readable from the lower 5 bits of the DMD bus. It is a
twos-complement, integer value.

Figure 1-6: Shifter Block Diagram

Whenever the SE or SB registers are output onto the DMD bus, they are sign-extended
to form a 16-bit value. Any of the SI, SE or SR registers can be read and written in the same
cycle. Registers are read at the beginning of the cycle and written at the end of the cycle. All
register reads, therefore, read values loaded at the end of a previous cycle. A new value written
to a register cannot be read out until a subsequent cycle. This allows an input register to provide
an operand to the Shifter at the beginning of the cycle and be updated with the next operand at
the end of the same cycle. It also allows a result register to be stored in memory and updated
with a new result in the same cycle.

The shifter section contains a duplicate bank of registers, shown in Figure 1-6 behind the
primary registers. There are actually two sets of SE, SB, SI, SR1, and SR0 registers. Only one
bank is accessible at a time. The additional bank of registers can be activated for extremely fast
context switching. A new task, such as an interrupt service routine, can then be executed without
transferring current states to storage. The selection of the primary or alternate bank of registers
is controlled by a bit in the processor mode status register (MSTAT). If this bit is a 0, the primary
bank is selected; if it is a 1, the secondary bank is selected.

The control code is an 8-bit signed value which indicates the direction and number of places
the input is to be shifted. Positive codes indicate a left shift (upshift) and negative codes indicate
a right shift (downshift). The control code can come from three sources: the content of the
shifter exponent (SE) register, the negated content of the SE register or an immediate value
from the instruction.

The HI/LO signal determines the reference point for the shifting. In the HI state, all shifts
are referenced to SR1 (the upper half of the output field), and in the LO state, all shifts are
referenced to SR0 (the lower half). The HI/LO reference feature is useful when shifting 32-bit
values since it allows both halves of the number to be shifted with the same control code. The HI/LO
reference signal is selectable each time the shifter is used.

The shifter fills any bits to the right of the input value in the output field with zeros, and 
bits to the left are filled with the extension bit (X). The extension bit can be fed by three possible 
sources depending on the instruction being performed. The three sources are the MSB of the 
input, the AC bit from the arithmetic status register (ASTAT) or a zero. 

The OR/PASS logic allows the shifted sections of a multiprecision number to be combined
into a single quantity. When PASS is selected, the shifter array output is passed through and
loaded into the shifter result (SR) register unmodified. When OR is selected, the shifter array
output is bitwise ORed with the current contents of the SR register before being loaded there.
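
The combined effect of the control code, the HI/LO reference, the extension bit and the
OR/PASS logic can be sketched in Python as follows. This is a behavioral model built from
the description above, not a statement of how the hardware is implemented; the function
name and its parameters are our own, and the caller is assumed to choose the extension bit
(for example, the input MSB for an arithmetic shift or zero for a logical shift).

```python
def shift(value16, code, hi=True, ext_bit=0, sr=None):
    """Place a 16-bit input in a 32-bit output field.

    code    : signed control code, positive for upshift, negative for downshift
    hi      : True references the shift to the upper half, False to the lower half
    ext_bit : bit used to fill positions to the left of the input
    sr      : current SR contents to OR with the output; None selects PASS
    """
    value16 &= 0xFFFF
    # Extend the input to the left with the extension bit (all ones or all zeros).
    extended = value16 - 0x10000 if ext_bit else value16
    offset = (16 if hi else 0) + code
    if offset >= 0:
        out = (extended << offset) & 0xFFFFFFFF   # vacated right-hand bits are zero
    else:
        out = (extended >> -offset) & 0xFFFFFFFF  # bits shifted off the right are lost
    if sr is not None:
        out |= sr & 0xFFFFFFFF                    # OR mode combines with existing SR
    return out


if __name__ == "__main__":
    # Shift the 32-bit value 0x12345678 left by 4 in two steps.
    upper = shift(0x1234, 4, hi=True)
    both = shift(0x5678, 4, hi=False, sr=upper)
    print(hex(upper), hex(both))
```

The two-step example shows why the HI/LO reference is convenient: the same control code is
applied to both halves of a 32-bit number, and the OR option merges the second result with
the first.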

The exponent detector derives an exponent for the shifter input value. The exponent
detector operates in one of three ways which determine how the input value is interpreted. In
the HI state, the input is interpreted as a single precision number or the upper half of a double
precision number. The exponent detector determines the number of leading sign bits and
produces a code which indicates how many places the input must be up-shifted to eliminate all but
one of the sign bits. The code is negative so that it can become the effective exponent for the
mantissa formed by removing the redundant sign bits.

In the Hi-extend state (HIX), the input is interpreted as the result of an add or subtract 
performed in the ALU section which may have overflowed. Therefore the exponent detector 
takes the arithmetic overflow (AV) status into consideration. If AV is set, then a +1 exponent 
is output to indicate an extra bit is needed in the normalized mantissa (the ALU Carry bit); if 
AV is not set, then Hi-extend functions exactly like the HI state. When performing a derive
exponent function in HI or Hi-extend modes, the exponent detector also outputs a shifter sign 
(SS) bit which is loaded into the arithmetic status register (ASTAT). The sign bit is the same 
as the MSB of the shifter input except when AV is set; when AV is set in the Hi-extend state, 
the MSB is inverted to restore the sign bit of the overflowed value. 

In the LO state, the input is interpreted as the lower half of a double precision number. In 
the LO state, the exponent detector interprets the SS bit in the arithmetic status register (ASTAT) 
as the sign bit of the number. The SE register is loaded with the output of the exponent detector 
only if SE contains -15. This occurs only when the upper half (which must be processed
first) contained all sign bits. The exponent detector output is also offset by -16 to account for
the fact that the input is actually the lower half of a 32-bit value.

The exponent compare logic is used to find the largest exponent value in an array of shifter
input values. The exponent compare logic in conjunction with the exponent detector derives a
block exponent. The comparator compares the exponent value derived by the exponent detector
with the value stored in the shifter block exponent (SB) register and updates the SB register
only when the derived exponent value is larger than the value in the SB register.
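
The three exponent detector states and the exponent compare logic can be sketched in Python as follows. The function names are hypothetical and the model follows only the description above; it is an illustration, not a cycle-accurate account of the hardware.

```python
def leading_matches(value16, bit):
    """Count the leading bits of a 16-bit value that equal `bit` (0..16)."""
    count = 0
    for pos in range(15, -1, -1):
        if (value16 >> pos) & 1 != bit:
            break
        count += 1
    return count


def exp_hi(value16, av=0, hix=False):
    """Return (exponent, SS) for the HI state, or HI-extend when hix=True."""
    msb = (value16 >> 15) & 1
    if hix and av:
        # Overflowed ALU result: +1 exponent, and SS is the inverted MSB.
        return 1, msb ^ 1
    sign_bits = leading_matches(value16, msb)
    return -(sign_bits - 1), msb      # all but one sign bit are redundant


def exp_lo(value16, ss, se):
    """Return the new SE value for the LO state."""
    if se != -15:
        return se                     # upper half was not all sign bits
    k = leading_matches(value16, ss)  # sign bits continuing into the lower half
    return -(k - 1) - 16              # HI-style code offset by -16


def update_sb(sb, exponent):
    """Exponent compare: SB changes only if the new exponent is larger."""
    return exponent if exponent > sb else sb
```

For example, the input 0xF0F0 has four leading sign bits, so exp_hi returns an exponent of -3: three upshifts leave a single sign bit.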

Data Address Generators (DAG)

The ADSP-2101 contains two independent data address generators so that both program and
data memories can be accessed simultaneously. The DAGs provide indirect addressing
capabilities and perform automatic address modification. For circular buffers, the DAGs can perform
modulo address modification. The two DAGs differ: DAG1 only generates data memory
addresses, but provides an optional bit-reversal capability; DAG2 can generate both data
memory and program memory addresses, but has no bit-reversal capability. 
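
To make the bit-reversal capability of DAG1 concrete, the following Python sketch reverses the bit order of an address. It assumes that all 14 bits of the address are reversed; the function name bit_reverse14 is hypothetical.

```python
def bit_reverse14(address):
    """Reverse the bit order of a 14-bit address (assumed width)."""
    result = 0
    for _ in range(14):
        result = (result << 1) | (address & 1)  # move the LSB to the other end
        address >>= 1
    return result


# Consecutive indices map to widely separated addresses, which is the
# ordering a radix-2 FFT needs for its input or output data.
print([hex(bit_reverse14(n)) for n in range(4)])
```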

Figure 1-7 shows a block diagram of a single data address generator. There are three 
register files: the modify (M) register file, the index (I) register file, and the length (L) register 
file. Each of the register files contains four 14-bit registers which can be read from and written
to via the DMD bus. 







Figure 1-7: Data Address Generator Block Diagram 

The I registers (I0-3 in DAG1, I4-7 in DAG2) contain the actual addresses used to access
memory. When data is accessed in the indirect mode, the address stored in the selected I register
becomes the memory address. With DAG1, the output address can be bit-reversed by setting
the appropriate mode bit in the mode status register (MSTAT) as discussed below. Bit-reversal
facilitates FFT addressing.

The data address generator employs a post-modify scheme. After an indirect data access,
the specified M register (M0-3 in DAG1, M4-7 in DAG2) is added to the specified I register to
generate the new I value. The choice of the I and M registers is independent within each DAG.
In other words, any register in the I0-3 set may be modified by any register in the M0-3 set in
any combination, but not by those in DAG2 (M4-7). The modification values stored in M
registers are signed numbers so that the next address can be either higher or lower. The address
generators support both linear addressing and circular addressing. The value of the L register
determines which addressing scheme is used. For circular buffer addressing, the L register is
initialized with the length of the buffer. For linear addressing, the modulus logic is disabled by
setting the corresponding L register to zero.

L registers and I registers are paired and the selection of the L register (L0-3 in DAG1, 
L4-7 in DAG2) is determined by the I register used. Each time an I register is selected, the 
corresponding L register provides the modulus logic with the length information. If the sum of 
the M register content and the I register content crosses the buffer boundary, the modified I 
register value is calculated by the modulus logic using the L register value. 

All data address generator registers (I, M, and L registers) are loadable and readable from 
the lower 14 bits of the DMD bus. Since I and L register contents are considered to be unsigned, 
the upper 2 bits of the DMD bus are padded with zeros when reading them. M register contents 
are signed; when reading an M register, the upper 2 bits of the DMD bus are sign-extended. 

The modulus logic implements automatic pointer wraparound for accessing circular 
buffers. To calculate the next address, the modulus logic uses the following information. 

• The current location; found in the I register (unsigned) 

• The modify value; found in the M register (signed) 

• The buffer length; found in the L register (unsigned) 

• The buffer base address 

From these inputs, the next address is calculated with the formula:

Next address = (I + M - B) modulo L + B

where:

I = current address
M = modify value (signed)
B = base address (generated by the linker)
L = buffer length
M + I = modified address
|M| < L (this ensures that the next address cannot wrap around the buffer more than once
in one operation). 
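
The modulus formula can be checked with a few lines of Python. This is a sketch only: the function name next_address is hypothetical, and the 14-bit wrap applied in the linear case (L = 0) is an assumption based on the 14-bit width of the I registers.

```python
def next_address(i, m, length, base):
    """Post-modify address update following the modulus formula above."""
    if length == 0:
        # Linear addressing: modulus logic disabled (14-bit wrap assumed).
        return (i + m) & 0x3FFF
    if abs(m) >= length:
        raise ValueError("|M| must be smaller than L")
    # Python's % always returns a non-negative result for a positive modulus,
    # so negative modify values wrap to the end of the buffer.
    return (i + m - base) % length + base


# A 5-word circular buffer at base address 0x0100, stepping by 3.
addr = 0x0100
for _ in range(5):
    print(hex(addr))
    addr = next_address(addr, 3, 5, 0x0100)
```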

Program Sequencer 

The program sequencer generates a stream of instruction addresses, and provides flexible control 
of program flow. It provides for zero-overhead looping, single-cycle branching (both conditional
and unconditional) and sophisticated interrupt processing. Figure 1-8 shows a block diagram 
for the program sequencer and status sections of the ADSP-2101. 



Figure 1-8: ADSP-2101 Program Sequencer



The sequencing logic controls the flow of ADSP-2101 program execution by outputting 
a program memory address onto the PMA bus from one of the following four possible sources: 
PC incrementer, PC stack, instruction register, interrupt controller. The next address source
selector in the diagram controls which of these four sources is output from the next address
multiplexer, based on outputs from the instruction register, condition logic, loop comparator, 
and interrupt controller. A fifth possibility for the next program memory address, although not 
part of the program sequencer, is DAG2 when a register indirect jump is executed. 

The PC incrementer is selected as the source of the next program memory address if 
program flow is sequential. This is also the case when a conditional jump or return is not taken 
and when a DO UNTIL loop terminates (see below for a description of the DO UNTIL construct).

The PC stack is used as the source for the next program memory address when a return
from subroutine or return from interrupt is executed. The top stack value is also used as the next
program memory address when returning to the top of a DO UNTIL loop.

The instruction register is selected by the next address multiplexer when a direct jump is
taken. The jump address field of the instruction word itself specifies the jump address.

The interrupt controller provides the next program memory address when processing an
interrupt. Upon recognizing an interrupt, the processor jumps to the interrupt vector location
corresponding to the active interrupt request. The interrupt vector locations are four program
memory locations apart; this allows short service routines to be coded in place. For longer
routines, control is transferred to the interrupt service routine by means of a jump instruction
at the interrupt vector.

DAG2 sources the next program memory address when executing a register indirect jump. 
In this case, since DAG2 is not an input to the next address multiplexer, the program counter
must be loaded from the PMA bus. 

The program sequencer section contains six status registers. These are the Arithmetic
Status register (ASTAT), the Stack Status register (SSTAT), the Mode Status register (MSTAT),
the Interrupt Control register (ICNTL), the Interrupt Mask register (IMASK) and the Interrupt
Force and Clear register (IFC).

Interrupts 

The interrupt controller allows the processor to respond to the six possible interrupts with a 
minimum of overhead. Individual interrupt requests are logically ANDed with the bits in 
IMASK; the highest priority unmasked interrupt is then selected. 

The interrupt control register, ICNTL, allows each interrupt to be set as either edge- or 
level-sensitive. Depending on bit 4 in ICNTL, interrupt routines can either be nested with higher 
priority interrupts taking precedence or processed sequentially with only one interrupt service 
active at a time. 

The 12-bit interrupt force and clear register, IFC, is a write-only register that contains a
force bit and a clear bit for each of the six possible interrupts.

When responding to an interrupt, the status registers ASTAT, MSTAT and IMASK are pushed
onto the status stack and the program counter is loaded with the appropriate vector address. The
status stack is seven levels deep to allow interrupt nesting. The stack is automatically popped
when a return from the interrupt is executed.

The vector addresses for each interrupt are fixed. In the ADSP-2101 each vector location 
identifies a block of four instructions. Short service routines can be executed without an
additional JUMP, minimizing overhead.

IMASK 

IMASK is six bits wide and allows the interrupt inputs to be individually enabled or disabled. 
The bits in IMASK are:

0 Timer interrupt enable

1 IRQ0 or SPORT1 receive interrupt enable

2 IRQ1 or SPORT1 transmit interrupt enable

3 SPORT0 receive interrupt enable

4 SPORT0 transmit interrupt enable

5 IRQ2 interrupt enable



The bits are all positive sense (0=disabled, 1=enabled). IMASK is set to zero upon a
processor reset so that all interrupts are disabled initially. 



ICNTL

ICNTL is a 5-bit register configuring the interrupt modes of the processor. The bits in ICNTL
are:

0 IRQ0 sensitivity (if IRQ0 is configured)

1 IRQ1 sensitivity (if IRQ1 is configured)

2 IRQ2 sensitivity

3 zero

4 Interrupt Nesting Mode

The sensitivity bits determine whether a given interrupt input is edge- or level-sensitive
(0 = level-sensitive, 1 = edge-sensitive).

The interrupt nesting mode determines whether higher priority interrupt service routines
are automatically nested. When the nesting mode bit is cleared, all IMASK bits are automatically
cleared when an interrupt service routine is entered, so that all interrupts are masked (no nesting
can occur). The previous IMASK value is pushed on the stack. When the nesting mode bit is
set, only the bits in IMASK for equal and lower priority interrupts are cleared. Higher priority
interrupts that are not masked can interrupt the current interrupt service routine. The value of
IMASK can also be changed at any time to allow other nesting schemes.

Edge-triggered interrupts are automatically cleared when the interrupt service routine is
called. They can also be cleared by writing a one to the appropriate IFC bit. The Timer and
Serial Port interrupts act as edge-sensitive interrupts which can be masked, cleared or forced
with software.

IFC

The write-only IFC register is twelve bits wide and contains a bit for clearing and a bit for
forcing each of the six possible interrupts in the ADSP-2101. The bits in IFC are defined as
follows.

Bit 0 Timer interrupt clear

Bit 1 SPORT1 receive or IRQ0 interrupt clear

Bit 2 SPORT1 transmit or IRQ1 interrupt clear

Bit 3 SPORT0 receive interrupt clear

Bit 4 SPORT0 transmit interrupt clear

Bit 5 IRQ2 interrupt clear

Bit 6 Timer interrupt force

Bit 7 SPORT1 receive or IRQ0 interrupt force

Bit 8 SPORT1 transmit or IRQ1 interrupt force

Bit 9 SPORT0 receive interrupt force

Bit 10 SPORT0 transmit interrupt force

Bit 11 IRQ2 interrupt force

Pending edge-sensitive interrupts can be cleared by writing a one to the appropriate clear
bit (0-5) in IFC. Edge-triggered interrupts are cleared automatically when the corresponding
interrupt service routine is called.

Edge-sensitive interrupts can be forced under program control by writing a one to the
force bit (6-11) corresponding to the desired interrupt. This causes the chip to respond as though
the interrupt had occurred. Forcing a level-sensitive interrupt has no effect. The Timer and
SPORT interrupts behave like edge-sensitive interrupts and can be masked, cleared and forced. 
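
The interplay of IMASK, the nesting mode bit and IFC can be summarized in a small Python model. It is a sketch with hypothetical function names, and it assumes that priority rises with the IMASK bit number (bit 5, IRQ2, highest; bit 0, Timer, lowest), which agrees with the priority table given later in this chapter.

```python
def select_interrupt(requests, imask):
    """Return the IMASK bit number of the highest priority unmasked request."""
    pending = requests & imask & 0x3F       # requests are ANDed with IMASK
    return pending.bit_length() - 1 if pending else None


def imask_on_entry(imask, serviced_bit, nesting):
    """IMASK value in effect after a service routine has been entered."""
    if not nesting:
        return 0                            # everything masked, no nesting
    # Clear the bits for equal and lower priority interrupts only.
    lower_or_equal = (1 << (serviced_bit + 1)) - 1
    return imask & ~lower_or_equal & 0x3F


def ifc_word(clear_bits=(), force_bits=()):
    """Build a 12-bit IFC value: clear bits at 0-5, force bits at 6-11."""
    word = 0
    for bit in clear_bits:
        word |= 1 << bit
    for bit in force_bits:
        word |= 1 << (bit + 6)
    return word


# Timer (bit 0) and IRQ2 (bit 5) both pending, all interrupts enabled.
print(select_interrupt(0b100001, 0b111111))
```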

Loop Mechanisms 

The loop stack and comparator provide the zero-overhead looping mechanism. A DO UNTIL
instruction contains the address for the end of the loop and the termination condition. When a
DO UNTIL instruction is executed, this information is loaded into the loop stack, and the PC
value is pushed onto the PC stack after being incremented. The loop comparator compares the
end of loop address with the next address, and signals the end of loop when the two are equal.
The processor then checks if the termination condition is met. Depending on this condition,
the next address selector chooses between the PC stack (jump to beginning of loop) and the PC
incrementer (fall out of loop). The loop stack is four levels deep, permitting four levels of
zero-overhead loop nesting.

The down counter and the count stack also support this powerful looping mechanism. The 
down counter is a 14-bit register with auto-decrement capability. It is loaded from the DMD 
bus with the loop count. The count is decremented every time the counter value is checked; 
when the count expires, the counter expired (CE) flag is set. The count stack allows the nesting 
of loops by storing temporarily dormant loop counts. When a new value is loaded into the 
counter from the DMD bus, the current counter value is automatically pushed onto the count 
stack, as program flow enters a loop. The count stack is automatically popped whenever the CE 
flag is tested and is true, thereby resuming execution of the code outside the loop. It is also 
possible to overwrite the counter, without pushing its value on the count stack, if loop nesting 
is not occurring. 
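
The cooperation of the down counter and the count stack in nested loops can be illustrated with the following Python sketch. The class and function names are hypothetical, stack depth limits are not modeled, and the model assumes that the counter is tested once at the end of each pass so that a loaded count of N gives N passes.

```python
class LoopCounter:
    """Simplified model of the down counter and count stack."""

    def __init__(self):
        self.counter = 0
        self.stack = []

    def load(self, count):
        """Load a new loop count; the current count is pushed automatically."""
        if count < 1:
            raise ValueError("this sketch only covers counts of 1 or more")
        self.stack.append(self.counter)
        self.counter = count & 0x3FFF      # 14-bit counter

    def test_ce(self):
        """Decrement on every test; pop the count stack when CE is true."""
        self.counter -= 1
        if self.counter == 0:
            self.counter = self.stack.pop()
            return True
        return False


def run_nested(outer, inner):
    """Count the inner-loop passes of two nested counted loops."""
    lc = LoopCounter()
    passes = 0
    lc.load(outer)
    while True:
        lc.load(inner)                     # entering the inner loop
        while True:
            passes += 1
            if lc.test_ce():               # end of inner pass
                break
        if lc.test_ce():                   # end of outer pass
            break
    return passes, lc.counter, lc.stack
```

Calling run_nested(3, 2) performs six inner passes and leaves the counter and the count stack as they were before the loops began, which is the property that makes the dormant outer count safe while the inner loop runs.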

Status Registers 

The ADSP-2101 maintains six status registers, which can be accessed over the DMD bus (one 
is read-only and one is write-only, however). These registers are: 

ASTAT Arithmetic Status register 

SSTAT Stack Status register (read-only)

MSTAT Mode Status register

ICNTL Interrupt Control register

IMASK Interrupt Mask register 

IFC Interrupt Force and Clear (write-only) 

The interrupt registers are described in a previous section; the other three are discussed 
below. 

ASTAT 

ASTAT is 8 bits wide and holds the status information generated by the computational sections 
of the processor. The bits in ASTAT are defined as follows:

0    AZ    (ALU result zero)
1    AN    (ALU result negative)
2    AV    (ALU overflow)
3    AC    (ALU carry)
4    AS    (ALU X input sign)
5    AQ    (ALU quotient flag)
6    MV    (MAC overflow)
7    SS    (Shifter input sign)



The bits are positive sense (1=true, 0=false). They are automatically updated when a new
status is generated by the arithmetic operations affecting them, as defined by the following table:



Status Bit        Updated on:

AZ, AN, AV, AC    Any ALU operation except division
AS                ALU absolute value operation
AQ                ALU divide operations
MV                Any MAC operation except saturate MR
SS                Shifter exponent detect operation



The computation condition codes are described in Chapter 2.

SSTAT

SSTAT is 8 bits wide and holds the status of the four internal stacks. The bits in SSTAT are: 

0 PC Stack Empty

1 PC Stack Overflow 

2 Count Stack Empty 

3 Count Stack Overflow 

4 Status Stack Empty 

5 Status Stack Overflow 

6 Loop Stack Empty 

7 Loop Stack Overflow 

All of the bits are positive sense (1=true, 0=false). The empty status bits indicate that the
stack is empty. The overflow status bits indicate that the stack has overflowed. Since the stack
overflow status bits "stick" once they are set, subsequent pop operations have no effect on them.
This means that the stack can be both overflowed and empty under certain circumstances. A
processor reset or a software reboot must be executed to clear the stack overflow status.

MSTAT 

MSTAT is a 7-bit register that defines various operating modes of the processor. The Mode
Control instruction enables or disables the operating modes. The bits in MSTAT are:

0 Data Register Bank Select

1 Bit Reverse Mode (DAG1 only)

2 ALU Overflow Latch Mode

3 AR Saturation Mode

4 MAC Result Placement Mode

5 Timer Enable

6 Go Mode



The data register bank select bit determines which set of data registers is currently active
(0=primary, 1=secondary). The data registers include all of the result and input registers to the
ALU, MAC, and Shifter (AX0, AX1, AY0, AY1, AF, AR, MX0, MX1, MY0, MY1, MF, MR0,
MR1, MR2, SB, SE, SI, SR0 and SR1). At RESET, the data register bank select bit is cleared.

The bit reverse mode, when enabled, bit-wise reverses all addresses generated by DAG1. This
is most useful for re-ordering the input or output data in a radix-2 FFT algorithm.

The ALU overflow latch mode causes the AV (ALU overflow) status bit to "stick" once it is set.
In this mode, when an ALU overflow occurs, AV will be set and remain set, even if subsequent ALU
operations do not generate overflows. AV can then only be cleared by writing a zero into it from
the DMD bus.

The AR saturation mode, when set, causes ALU results to be saturated to the
maximum positive (0x7FFF¹) or negative (0x8000) values when an ALU overflow or underflow
occurs.

The MAC Result Placement bit, when set to 0, results in the ADSP-2100 result
placement of the multiplier product in the MR register (one bit shift). When this bit is 1, no shift
occurs.

The Timer Enable bit, when set to 1, enables the timer decrement mechanism.

The Go Mode bit, when set to 1, allows the processor to continue operations internally (when
possible) while the external address and data buses are tristated during a bus grant.
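
The effect of the AR saturation mode on a 16-bit addition can be sketched in Python as follows. The function name alu_add is hypothetical, and the model covers only the result and the AV bit, not the other ALU status bits or the overflow latch mode.

```python
def alu_add(x, y, ar_sat=False):
    """Add two 16-bit two's-complement words; return (AR, AV)."""
    # Interpret the 16-bit patterns as signed values.
    sx = x - 0x10000 if x & 0x8000 else x
    sy = y - 0x10000 if y & 0x8000 else y
    total = sx + sy
    av = int(not -0x8000 <= total <= 0x7FFF)   # result does not fit in 16 bits
    if av and ar_sat:
        ar = 0x7FFF if total > 0 else 0x8000   # clamp to the nearest limit
    else:
        ar = total & 0xFFFF                    # wrap around as usual
    return ar, av


print(alu_add(0x7000, 0x2000))               # wraps to a negative pattern
print(alu_add(0x7000, 0x2000, ar_sat=True))  # clamps at the positive limit
```

Without saturation the sum of two large positive numbers wraps to a negative bit pattern; with saturation it is held at the largest positive value, which is usually the less harmful error in a signal processing chain.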

1.2.2 Additional Features of the ADSP-2101 

Being a microcomputer, the ADSP-2101 contains supplementary internal hardware elements 
so that a basic system configuration can be built with a minimum number of external devices. 
These elements include a timer, two serial ports (SPORT), boot address generator, program and
data memories. 

Timer 

The ADSP-2101 contains a programmable interval timer to generate periodic interrupts based
on multiples of the processor's cycle time. Figure 1-9 shows the timer block diagram. It includes
two 16-bit registers, TCOUNT and TPERIOD, and one 8-bit register, TSCALE. These registers
are memory mapped. The extended mode control instruction enables and disables the timer by
setting and clearing bit 5 in the mode status register, MSTAT.

¹ Throughout this book, hexadecimal numbers are denoted by the "0x" prefix.

Figure 1-9: Timer Block Diagram

TCOUNT is a count register. When the timer is enabled, it is decremented as often as
once every instruction cycle. When the counter reaches zero, an interrupt is generated. TCOUNT
is then reloaded from the TPERIOD register and the count begins again. TSCALE stores a
scaling value that is one less than the number of cycles between decrements of TCOUNT. For
example, if the value in TSCALE register is 0, the counter register decrements once every cycle.
If the value in TSCALE is 1, the counter decrements once every 2 cycles. In a processor with
an 80ns cycle time, for example, the timer interrupt could occur as infrequently as every 1.34
seconds if a maximum scaling value is used. With an 80ns resolution, a maximum period of
5.24ms can be timed. 
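
The two figures quoted above can be reproduced with a short Python calculation. This sketch assumes that the interval between interrupts is (TPERIOD + 1) x (TSCALE + 1) processor cycles; that formula is an assumption made here for illustration, and the function name timer_interval is hypothetical.

```python
def timer_interval(tperiod, tscale, cycle_time=80e-9):
    """Seconds between timer interrupts under the assumed formula."""
    if not (0 <= tperiod <= 0xFFFF and 0 <= tscale <= 0xFF):
        raise ValueError("TPERIOD is 16 bits wide and TSCALE is 8 bits wide")
    return (tperiod + 1) * (tscale + 1) * cycle_time


# Longest period at full 80ns resolution (TSCALE = 0).
print(timer_interval(0xFFFF, 0))
# Longest period with the maximum scaling value (TSCALE = 255).
print(timer_interval(0xFFFF, 0xFF))
```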

Serial Ports 

The ADSP-2101 incorporates two complete serial ports, SPORT0 and SPORT1, for serial
communications and multiprocessor coordination. Each serial port has a 5-pin interface
consisting of the following signals:

Signal Name    Function

SCLK           Serial clock I/O
RFS            Receive frame synch I/O
TFS            Transmit frame synch I/O
DR             Serial data receive
DT             Serial data transmit

Here is a brief list of the capabilities of the ADSP-2101 SPORTs. Figure 1-10 shows a
simplified block diagram of a single SPORT.

Figure 1-10: Serial Port Block Diagram

• Bidirectional: each SPORT has a separate transmit and receive section.

• Double-buffered: each SPORT section (both receive and transmit) has a data register
accessible to the user and an internal transfer register. The double-buffering provides
additional time to service the SPORT.

• Flexible clocking: each SPORT can use an external serial clock (from 0 Hz to the
processor frequency) or generate its own (from 1/2^17 of the processor frequency to 1/2
the processor frequency).

• Flexible framing: framings for the receive and transmit sections on each SPORT are
independent. Each section can run in a frameless mode, with internally-generated or
externally-generated frame synch signals, with active high or inverted frame signals,
and with either of two pulse widths or timings. The receive and transmit sections share
the same serial clock.

• Flexible word length: each SPORT supports serial data word lengths from three to
sixteen bits.

• Companding in hardware: each SPORT provides optional A-law and μ-law
companding. Different companding can be used for each SPORT, for example, A-law for
SPORT0 and μ-law for SPORT1.

• Flexible interrupt scheme: each SPORT section (receive and transmit) can generate a
unique interrupt upon completing a data word transfer or after transferring an entire
buffer (see next item).

• Auto-buffering with single-cycle overhead: using the ADSP-2101 DAGs, each
SPORT can receive and/or transmit an entire circular buffer of data with an overhead
of only one cycle per data word. Transfers to and from the SPORT and the circular
buffer are automatic in this mode and do not require additional programming. An
interrupt is generated only when pointer wraparound occurs in the circular buffer.

• Multichannel capability: SPORT0 provides a multichannel interface for selective
receipt and transmission of arbitrary data channels from a twenty-four or thirty-two
word, time-division multiplexed, serial bitstream. This is especially useful for T1 or
CEPT interfaces or as a network communication scheme for multiple processors.

• Alternate configuration: SPORT1 can be configured as two external interrupt inputs
(IRQ0 and IRQ1) and the Flag In and Flag Out signals. The internally generated serial
clock may still be used in this configuration.

Each SPORT has a receive and a transmit register; SPORT0's registers are RX0 and TX0,
SPORT1's are RX1 and TX1. Companding (a contraction of COMpressing and exPANDing)
is the process of logarithmically encoding data to reduce the number of bits that must be sent.
Both SPORTs share the companding hardware: one expansion and one compression operation
can occur in each processor cycle. In the event of contention, SPORT0 has priority. The
ADSP-2101 supports both of the widely used algorithms for companding: A-law and μ-law.
The type of companding can be independently selected for each SPORT. 
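
The idea of logarithmic encoding can be illustrated with the continuous μ-law curve for μ = 255. This Python sketch is not the encoding performed by the SPORT hardware, which works on quantized digital words; the function names are hypothetical, and the input is assumed to be a sample normalized to the range -1 to +1.

```python
import math


def mu_law_compress(x, mu=255.0):
    """Continuous mu-law compression of a sample in the range -1..1."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_law_expand(y, mu=255.0):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


# Small amplitudes are stretched and large ones squeezed, so that a
# coarse quantizer applied afterwards treats quiet signals more finely.
for sample in (0.01, 0.1, 0.5, 1.0):
    print(sample, round(mu_law_compress(sample), 3))
```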

The TXn and RXn registers are identified by name in the ADSP-2101 assembly language,
not memory-mapped. TXn and RXn can be read and written (like other non-data registers) with
the following instruction types: read/write to data memory (direct address), load non-data
immediate, and internal (register-to-register) moves.

There are two ways to generate the SPORT interrupts: after the transmit or receipt of 1) 
each word (normal word by word operation) or 2) each complete buffer of data words (autobuffer 
operation). 

These serial port features, in conjunction with other features of the ADSP-2101, make it
possible to interface to most codecs, A/Ds, DACs and to additional ADSP-2101s with no
additional hardware and little or no software overhead.

Memories

The ADSP-2101 has three separate memory spaces: data memory, program memory and boot
memory. Boot memory is only active during the loading of program code from an external
device (ROM, EPROM, or RAM). Program memory consists of a single address space 24-bit
(3-byte) wide, 2K of which resides on the chip and the rest is external. This program memory
is dual purpose for both instruction and data storage. Hence two program memory locations
can be accessed in a single cycle. Data memory is a single address space 16-bit (2-byte) wide,
1K of which resides on the chip and the rest is external. There are separate program and data
buses on the chip.

Boot Address Generator 

To execute the boot operation, the boot address generator generates the appropriate byte
addresses and loads the ADSP-2101 internal program memory with the contents of the external
EPROM. The ADSP-2101 internal program memory is loaded beginning with the high
addresses. Although 2K words of 3-byte wide internal program memory require only 6K bytes
of storage, boot memory is organized into eight pages which are 8K bytes long. Every fourth
byte of a page is an "empty" byte, except the first one, which contains the page length. The
page length is read first and then bytes are loaded from the top of the page downwards. This
results in shorter booting times for shorter pages. The boot address generator is designed to
generate the proper sequence of addresses.

PMD-DMD Bus Exchange 

This unit couples the program memory data bus and the data memory data bus, allowing them
to transfer data in both directions. Since the program memory data (PMD) bus is 24 bits wide,
while the data memory data (DMD) bus is 16 bits wide, only the upper 16 bits of PMD can be
directly transferred. An internal register (PX) is loaded with (or supplies) the additional 8 bits.
This register can be directly loaded or read when the full 24 bits are required.

Figure 1-11 shows a block diagram of this circuit. There are two types of connections provided 
in this circuit. 

The first type of connection is a one-way path from each bus to the other. This is implemented
with two tristate buffers connecting the DMD bus with the upper 16 bits of the PMD bus. One 
of these two buffers is normally used when data is exchanged between the program memory 
and one of the registers connected to the DMD bus. This is the path used to write data to program 
memory; it is not shown in the individual computational unit block diagrams. 

The second connection is through the PX register. The PX register is 8 bits wide and can be
loaded from either the lower 8 bits of the DMD bus or the lower 8 bits of the PMD bus. Its 
contents can also be read to the lower 8 bits of either bus. 
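
The split of a 24-bit program memory word between the 16-bit DMD bus and the 8-bit PX register can be written out in Python as follows. The function names are hypothetical; the sketch only shows the bit arithmetic implied by the description above.

```python
def split_pm_word(word24):
    """Split a 24-bit PMD word into (upper 16 bits for DMD, lower 8 for PX)."""
    return (word24 >> 8) & 0xFFFF, word24 & 0xFF


def join_pm_word(upper16, px):
    """Rebuild the 24-bit PMD word from a DMD value and the PX register."""
    return ((upper16 & 0xFFFF) << 8) | (px & 0xFF)


dmd, px = split_pm_word(0x123456)
print(hex(dmd), hex(px))
```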

External Buses 

The program memory address bus (PMA) and the data memory address bus (DMA) are multiplexed
into one bus and driven off chip; likewise, the program memory data bus (PMD) and the data
memory data bus (DMD) are multiplexed into one bus and driven off chip. The sixteen MSBs
of the external data bus are used as the DMD bus.

Figure 1-11: PMD-DMD Bus Exchange



1.3 ADSP-2101 BASIC SYSTEM

Figure 1-12 shows a basic system configuration with the ADSP-2101, two serial devices, a boot
EPROM and optional external program and data memories. Up to 15K words of data memory
and 16K words of program memory can be supported. Programmable wait state generation
allows the processor to interface easily to slow memories. In this section, we discuss interfacing
of the ADSP-2101 with the external devices.

1.3.1 System Interface 

In this section, we present several issues related to clock crystal and interrupt handling. 

Clock Signals 

The ADSP-2101 takes a TTL-compatible clock signal, CLKIN, running at the instruction rate. 
A clock output (CLKOUT) signal is generated by the processor synchronized to the processor's 
internal cycles. The rising edge of CLKOUT is aligned with the rising edge of CLKIN. CLKIN 
may not be halted, changed during operation or operated below the specified frequency. 






[Block diagram: the ADSP-2101 connected to a clock or crystal, two optional serial devices
(serial port 1 pins shared with IRQ1, FO and FI), a boot EPROM (2764, 27128, 27256 or 27512;
250ns), and optional external program memory, data memory and peripherals.]

NOTE: The two MSBs of the Boot EPROM Address are also the two MSBs of the
Data Bus. This is only required for the 27256 and 27512.


Figure 1-12: ADSP-2101 Basic System Configuration

Figure 1-13: External Crystal Connections

Because the ADSP-2101 contains an internal oscillator, an external crystal may be used
in place of an external clock oscillator. The crystal should be connected across the CLKIN input
and the XTAL input, as shown in Figure 1-13. If an external clock oscillator is used, the XTAL
input must not be connected.


Interrupt Handling 



The ADSP-2101 provides up to three external interrupt input pins, IRQ0, IRQ1 and IRQ2. IRQ2
is always available as a dedicated pin; IRQ1 and IRQ0 may be alternately configured as part of
serial port 1. The input pins can be programmed to be either level- or edge-sensitive. The
ADSP-2101 also supports internal interrupts from the timer and the two serial ports. The interrupt
levels are internally prioritized and individually maskable. The priorities of all six interrupts
are shown below. 

The ADSP-2101 supports a vectored interrupt scheme: when an interrupt is acknowledged,
the processor shifts program control to the interrupt vector address corresponding to the interrupt 
level. Interrupts can optionally be nested so that a higher priority interrupt can preempt the 
currently executing interrupt service routine. Each interrupt vector location is four instructions 
in length, so that simple service routines can be coded entirely in this space. Longer routines 
require an additional JUMP or CALL. 

Source of Interrupt                              Interrupt Vector

IRQ2 (external pin)                              0004 (highest priority)
SPORT0 Transmit (internal)                       0008
SPORT0 Receive (internal)                        000C
SPORT1 Transmit (internal) or IRQ1 (external)    0010
SPORT1 Receive (internal) or IRQ0 (external)     0014
Timer (internal)                                 0018 (lowest priority)
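
The prioritized, maskable scheme described above can be modeled in a few lines. The following Python sketch is illustrative only: the vector addresses follow the table above, while the names VECTORS and next_vector, and the use of sets of names for pending and enabled interrupts, are assumptions made for this example and are not part of the ADSP-2101 tools.

```python
# Interrupt sources in priority order (highest first) with their vector
# addresses, following the table above. The source names are hypothetical.
VECTORS = [
    ("IRQ2", 0x0004),
    ("SPORT0_TX", 0x0008),
    ("SPORT0_RX", 0x000C),
    ("SPORT1_TX_or_IRQ1", 0x0010),
    ("SPORT1_RX_or_IRQ0", 0x0014),
    ("TIMER", 0x0018),
]


def next_vector(pending, enabled):
    """Return the vector address of the highest-priority interrupt that is
    both pending and enabled (unmasked), or None if there is none."""
    for name, address in VECTORS:
        if name in pending and name in enabled:
            return address
    return None


# Spacing between consecutive vectors: four instruction locations each.
SPACING = [b[1] - a[1] for a, b in zip(VECTORS, VECTORS[1:])]
```

Because each vector slot holds four instructions, every entry in SPACING is 4.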



RESET Signal 

The RESET signal initiates a master reset of the ADSP-2101. The RESET signal must be asserted
when the chip is powered up to assure proper initialization. RESET during initial power-up 
must be held long enough to allow the internal clock to stabilize. If RESET is activated sub- 
sequently, the clock continues and does not require this stabilization time. 

The master reset sets all internal stack pointers to the empty stack condition, masks all 
interrupts and clears the MSTAT register. When RESET is released, if there is no pending bus 
request and the chip is configured for booting, the boot-loading sequence is performed. Then 
the first instruction is fetched from program memory location 0x0000. 

Flag In and Flag Out Pins 



In addition to the IRQ1 and IRQ0 pins, the alternate configuration of SPORT1 provides the
ADSP-2101 with a Flag In (FI) and a Flag Out (FO) pin. In the alternate configuration, the
DR1 pin is redefined as Flag In and the DT1 pin as Flag Out. Clearing the SPORT1 configuration
bit in the system control register selects the alternate configuration.

FI may be used to control program branching. FO may be set, toggled, or cleared
in software to signal events or conditions to any other device such as a host processor. FO is
available as a read-only bit of the SPORT1 control register.

1.3.2 Memory Interface 

In this section, we describe bus interconnections and program/data memory maps as well as 
interfaces. 

Program Memory Interface 

The program memory address bus (PMA) and the program memory data bus (PMD) are mul- 
tiplexed with DMA and DMD, sharing the external data and address bus. The 14-bit address 
bus directly addresses up to 16K words, of which 2K are on-chip. The data bus is bidirectional 
and 24 bits wide to external program memory. 

There is no placement restriction for instruction code and data in the program memory
space, except for the locations used for interrupt and restart vectors. 



The program memory data lines are bidirectional. The Program Memory Select (PMS)
signal indicates access to the Program Memory and can be used as a chip select signal. The
Write (WR) signal indicates a write operation and can be used as a write strobe. The Read (RD)
signal indicates a read operation and can be used as a read strobe or output enable signal.

The ADSP-2101 writes data from its 16-bit registers to the 24-bit program memory using
the PX register to provide the lower eight bits. When it reads data (not instructions) from 24-bit
program memory to a 16-bit data register, the lower eight bits are placed in the PX register.
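
The role of the PX register can be illustrated with a short Python sketch. It only models the bit arithmetic described above; the function names split_pm_word and join_pm_word are hypothetical and chosen for this example.

```python
def split_pm_word(word24):
    """Model a data read from 24-bit program memory: return the pair
    (upper 16 bits for the data register, lower 8 bits for PX)."""
    word24 &= 0xFFFFFF
    return word24 >> 8, word24 & 0xFF


def join_pm_word(reg16, px):
    """Model a data write to 24-bit program memory: the 16-bit register
    supplies the upper bits and PX supplies the lower eight bits."""
    return ((reg16 & 0xFFFF) << 8) | (px & 0xFF)
```

For example, the 24-bit word 0x123456 splits into 0x1234 for the data register and 0x56 for PX, and joining the two parts restores the original word.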

Program Memory Maps 

Program memory can be mapped in two ways, depending on the state of the MMAP pin. Figure 
1-14 shows the two configurations. When MMAP=0, the internal RAM occupies 2K words 
beginning at address 0x0000. The external program memory uses the remaining 14K words 
beginning at address 0x0800. In this configuration, the boot loading sequence (described below) 
is automatically initiated when RESET is released.

When MMAP=1, 14K words of external program memory begin at address 0x0000 and
internal RAM is located in the upper 2K words, beginning at address 0x3800. In this
configuration, program memory is not loaded although it can be written to and read from under
program control.
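
The two program memory maps can be summarized as a small address classifier. This Python sketch is a reading aid under the address ranges stated above; the function name pm_region is hypothetical.

```python
def pm_region(address, mmap):
    """Classify a 14-bit program memory address as 'internal' or
    'external' for the given state of the MMAP pin (0 or 1)."""
    if not 0 <= address <= 0x3FFF:
        raise ValueError("program memory addresses are 14 bits wide")
    if mmap == 0:
        # Internal RAM occupies the first 2K words (0x0000 to 0x07FF).
        return "internal" if address < 0x0800 else "external"
    # MMAP=1: internal RAM occupies the upper 2K words (0x3800 to 0x3FFF).
    return "internal" if address >= 0x3800 else "external"
```

Note that address 0x0000 is internal when MMAP=0 and external when MMAP=1, which is why only the MMAP=0 configuration boots automatically into on-chip RAM.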

Boot Memory Interface 

The Boot memory space consists of an external 64K by 8 space, divided into eight separate 8K 
by 8 pages. Three bits in the system control register select which page is loaded by the Boot 
memory interface. Another bit in the system control register allows the user to force a boot 
loading sequence under software control. Boot loading from page 0 after RESET is initiated
automatically if MMAP=0.

Figure 1-14: ADSP-2101 Program Memory Map

The boot memory interface can generate 0 to 7 wait states; it defaults to 3 wait states after
RESET. This allows a 50MHz processor to use a slow (250 ns), low-cost EPROM for program 
storage. Program memory is loaded a byte at a time and converted to 24-bit words. 

The BMS and RD signals are used to select and strobe the boot memory interface. Only 
8-bit data is read over the data bus. To accommodate up to eight pages of boot memory, the two
MSBs of the data bus are used in the boot memory interface as the two MSBs of the boot space 
address. 
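
The boot address arithmetic can be sketched in Python. The sketch assumes that the eight 8K pages are laid out consecutively in the 64K boot space with page 0 at address 0; that layout, and the function names boot_address and split_boot_address, are assumptions made for illustration.

```python
PAGE_SIZE = 8 * 1024  # each boot page is 8K bytes


def boot_address(page, offset):
    """Return the 16-bit boot-space address of a byte, assuming the eight
    pages are stored back to back starting with page 0."""
    if not 0 <= page <= 7:
        raise ValueError("page must be 0..7")
    if not 0 <= offset < PAGE_SIZE:
        raise ValueError("offset must be within one 8K page")
    return page * PAGE_SIZE + offset


def split_boot_address(address16):
    """Split a 16-bit boot address into (two MSBs, lower 14 bits). The
    lower 14 bits fit on the address bus; the two MSBs are the ones that
    are carried on the two MSBs of the data bus."""
    return address16 >> 14, address16 & 0x3FFF
```

Under this layout, pages 0 and 1 fit within a 14-bit address, which is consistent with the figure note that the extra two address bits are only needed for the larger EPROMs.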

BR is recognized during the booting sequence. The bus is granted after the completion of 
loading the current byte. BR during booting may be used to implement booting under the control 
of a host processor.

Data Memory Interface

The data memory address bus (DMA) and the data memory data bus (DMD) are multiplexed
with PMA and PMD, sharing the external data and address bus. The DMA bus is 14 bits wide.
The bidirectional external data bus is 24 bits wide, with the upper 16 bits used for DMD transfers.



The Data Memory Select (DMS) signal indicates access to the Data Memory and can be 
used as a chip select signal. The Write (WR) signal indicates a write operation and can be used
as a write strobe. The Read (RD) signal indicates a read operation and can be used as a read 
strobe or output enable signal. 

The ADSP-2101 supports memory-mapped I/O, with the peripherals memory mapped
into the data memory address space and accessed by the processor in the same manner as data
memory.

Data Memory Map 

The on-chip data memory RAM resides in the 1K words of data memory beginning at address
0x3800, as shown in Figure 1-15. In addition, data memory locations from 0x3C00 to the end
of data memory at 0x3FFF are reserved. Control registers for the system, timer, wait state 
configuration and serial port operations are located in this region of memory. 

The remaining 14K of data memory is external. External data memory is divided into five
zones associated with five different wait states. This allows slower peripherals to be mapped 
into zones of data memory with more wait states. Figure 1-15 shows these zones. 
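
The data memory map described above can be expressed as a small classifier. This Python sketch uses only the address ranges given in the text; the boundaries of the five external wait-state zones appear only in Figure 1-15 and are deliberately not modeled. The function name dm_region is hypothetical.

```python
def dm_region(address):
    """Classify a 14-bit data memory address according to the map above."""
    if not 0 <= address <= 0x3FFF:
        raise ValueError("data memory addresses are 14 bits wide")
    if address >= 0x3C00:
        return "reserved (control registers)"
    if address >= 0x3800:
        return "internal RAM"
    return "external"


# The external region spans 0x0000..0x37FF, that is 14K words.
EXTERNAL_WORDS = 0x3800
```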

Bus Interface 

The ADSP-2101 can relinquish control of the data and address buses to an external device. 
When the external device requires access to memory, it asserts the Bus Request (BR) signal. If 
the ADSP-2101 is not performing an external memory access, then it responds to the active BR 
input in the same cycle by: 

• tristating the Data and Address bus and the PMS, DMS, BMS, RD, WR output drivers, 

• asserting the Bus Grant (BG) signal, 

• completing the current instruction, and 

• halting program execution. 

If the Go mode is set, however, the ADSP-2101 will not halt program execution until it 
encounters an instruction that requires an external memory access. 

If the ADSP-2101 is performing an external memory access when the external device
asserts the BR signal, then it will not tristate the memory interfaces or assert the BG signal until
the cycle after the access completes, up to eight cycles later depending on the number of wait
states. The instruction does not need to be completed when the bus is granted; the ADSP-2101
will grant the bus in between two memory accesses if an instruction requires more than one
memory access.

Figure 1-15: ADSP-2101 Data Memory Map

When the BR signal is released, the processor releases the BG signal, re-enables the output 
drivers and continues program execution from the point where it stopped. 

1.4 SUMMARY 

In this chapter we briefly described the architecture of the ADSP-2101 microcomputer. It
exhibits a high degree of parallelism, tailored to DSP requirements. The key features of this 
architecture are: 

• 12.5 MHz instruction rate, 80ns per instruction. 

• Single-cycle access to both program and data memory. 

• High degree of parallelism to:

  - compute the next program address

  - fetch the next instruction

  - perform one or two data moves

  - update two address pointers

  - perform a computation

  - receive and transmit data via two serial ports.
• Code compatibility with ADSP-2100. 

In the next chapter we present an overview of the ADSP-2101 instruction set. A basic
knowledge of the architecture is necessary and useful in learning the intricate details of assembly
language programming using the instruction set.



chapter 2 

ADSP-2101 INSTRUCTION SET 
OVERVIEW 



2.1 INTRODUCTION 

This chapter provides an overview of the instruction set used to program the ADSP-2101 chip. 
It provides enough information to understand the nature of programming the ADSP-2101 and 
the capabilities of the instruction set itself. This chapter is not a programmer's reference and 
therefore the ADSP-2101 Cross-Reference Manual [2] must be consulted for a complete ref- 
erence to the instruction set. 

For software development, the programmer must have access to the development tools:
System Builder, Assembler, Linker, Simulator and PROM Splitter. An overview of the
ADSP-2101 development system software is given in the next chapter. The complete description
of these tools is also given in the Cross-Software Manual.

The ADSP-2101 instruction set is tailored to the computation-intensive algorithms 
common in DSP applications. This is possible because the instruction set allows data movement 
between various computational units with minimum overhead. For example, sustained 
single-cycle multiplication/accumulation operations are possible. The instruction set provides 
full control of the ADSP-2101's three computational units: the ALU, MAC and Shifter. 
Arithmetic instructions can process single precision 16-bit operands directly with provisions
for multiprecision operations. The ADSP-2101 assembly language uses an algebraic syntax
for arithmetic operations and for data moves resulting in highly readable source code. The
sources and destinations of computations and data moves are written explicitly, eliminating
cryptic assembler mnemonics. There is no performance penalty for this; each program statement
assembles into one 24-bit instruction which executes in one cycle. There are no multicycle
instructions in the ADSP-2101 instruction set. Some fifty registers surrounding the computational
units are dual purpose: they are available for general purpose on-chip storage when not used in
computation. This saves many memory access cycles and provides excellent freedom in coding.

The control instructions provide conditional execution of most calculations and, in
addition to the usual JUMP and CALL, support a DO UNTIL looping instruction. Return from
interrupt (RTI) and return from subroutine (RTS) are also provided. These services are
made compact and speedy by the single cycle context save. The contents of the primary register
set are held constant while the alternate set is enabled for subroutine and interrupt services. This
eliminates the cluster of PUSHes and POPs of stacks common in general purpose
microprocessors.

The ADSP-2101 also provides the IDLE instruction for idling the processor until an 
interrupt occurs. IDLE puts the processor into a low-power state while waiting for interrupts. 
Two addressing modes are supported for memory fetches. Direct addressing uses immediate 
values; indirect addressing uses the two Data Address Generators (DAGs). 

The 24-bit instruction word allows a high degree of parallelism in performing operations.
The instruction set allows for a single-cycle execution of any of the following combinations: 

• any ALU, MAC or Shifter operation (may be conditional) 

• any register to register move 

• any data memory read or write 

• a computation with any data register to data register move

• a computation with any memory read or write 

• a computation with a read from two memories. 







Symbol              Meaning

+, -                Add, Subtract
*                   Multiply
a = b               Transfer into a the contents of b
,                   Separates multifunction instructions
DM(addr)            The contents of data memory at location "addr"
PM(addr)            The contents of program memory at location "addr"
[Option]            Anything within square brackets is an optional part of the
                    instruction statement
| option a |        A list of parameters enclosed by parallel vertical lines
| option b |        requires the choice of one parameter from among the
                    available list
CAPITAL LETTERS     Denote reserved words. These are instruction words,
                    register names and operand selections
parameters          Are shown in small letters and denote an operand in the
                    instruction for which there are numerous choices
<data>              Denotes an immediate data value
<addr>              Denotes an immediate value of an address to be coded in
                    the instruction
;                   End of instruction

Table 2-1: Notation Used in Instruction Set

The ADSP-2101 instruction set provides the programmer with maximum flexibility. The
instruction set provides moves from any register to any other register, or from most registers
to/from either memory. For combining operations, almost any ALU, MAC or Shifter operation 
may be combined with any register-to-register move or with a register move to or from either 
internal or external memory. 

There are five basic categories of instructions: computational instructions, data move 
instructions, multifunction instructions, program flow control instructions and miscellaneous 
instructions. Each of these instruction types is described in the next several sections. At the 
end of each section, a table summarizing the syntax of each instruction category is given. The
notation used in an instruction is shown in Table 2-1.

2.2 COMPUTATION INSTRUCTIONS 

This group of instructions covers all ALU, MAC and Shifter operations. There are two
functional classes: standard instructions, which include the bulk of the computation operations,
can be executed conditionally (IF condition ...) based on a test of the arithmetic status, and may be
combined with a data transfer in single-cycle multifunction instructions; and special instructions,
which form a small subset and must be executed individually. The permissible conditions are
listed in Table 2-2.



Condition                              Keyword

ALU result is:
    equal to zero                      EQ
    not equal to zero                  NE
    greater than zero                  GT
    greater than or equal to zero      GE
    less than zero                     LT
    less than or equal to zero         LE
ALU carry status:
    carry                              AC
    not carry                          NOT AC
x-input sign:
    positive                           POS
    negative                           NEG
ALU overflow status:
    overflow                           AV
    not overflow                       NOT AV
MAC overflow status:
    overflow                           MV
    not overflow                       NOT MV
Counter status:
    not expired                        NOT CE

Table 2-2: Permissible Conditions for Computation Instructions



Each computational unit has a set of input registers and output registers. A list of
permissible input operands and result registers is given in Table 2-3.

ALU
    Source for X input (xop):        AX0, AX1, AR
                                     MR0, MR1, MR2
                                     SR0, SR1
    Source for Y input (yop):        AY0, AY1
                                     AF
    Destination for output port R:   AR
                                     AF

MAC
    Source for X input (xop):        MX0, MX1, AR
                                     MR0, MR1, MR2
                                     SR0, SR1
    Source for Y input (yop):        MY0, MY1
    Destination for output port R:   MR (MR2, MR1, MR0)
                                     MF

Shifter
    Source for Shifter input (xop):  SI, SR0, SR1
                                     AR
                                     MR0, MR1, MR2
    Destination for Shifter output:  SR (SR1, SR0)

Table 2-3: Computational Input/Output Registers



2.2.1 ALU Group 

Standard Functions: Standard ALU instructions include add, subtract, logic (AND, OR, 
NOT, eXclusive-OR), pass, negate, increment, decrement, clear, and absolute value. The "-" 
function does twos-complement subtraction while NOT obtains a ones-complement. The PASS
function passes the listed operand but tests and stores status information for later sign/zero 
testing. As an example, consider an ALU addition instruction for add/add-with-carry in the 
form: 



[IF condition]   | AR | = xop   | + yop     | ;
                 | AF |         | + C       |
                                | + yop + C |



Instructions are in similar form for subtraction and logical operations. If the options AR and
"+ yop + C" are chosen, and if xop and yop are the contents of AX0 and AY0 respectively, the
unconditional instruction would read:

AR = AX0 + AY0 + C;

This algebraic expression means that the ALU result register (AR) gets the value of the ALU
x-input and y-input registers plus the value of the carry-in bit. This shortens code and speeds
execution by eliminating many separate register-move instructions.

When an optional IF condition is included, and if ALU Carry bit status is chosen then the
conditional instruction would read:

IF AC AR = AX0 + AY0 + C;

The conditional expression, IF AC, tests the ALU Carry bit; if there is a carry from the previous
instruction, this instruction executes, otherwise a NOP occurs and execution continues with the
next instruction.
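
The add-with-carry form and the IF AC condition can be modeled with a short Python sketch. It is a simplified behavioral model, not the processor's implementation: only the 16-bit result and the carry are modeled, the other status flags are ignored, and the function names are hypothetical.

```python
def add_with_carry(x, y, carry_in):
    """Model 'AR = xop + yop + C' on 16-bit operands.
    Returns (16-bit result, carry out)."""
    total = (x & 0xFFFF) + (y & 0xFFFF) + (1 if carry_in else 0)
    return total & 0xFFFF, total > 0xFFFF


def if_ac_add(ar, carry, x, y):
    """Model 'IF AC AR = xop + yop + C': the add executes only when the
    carry is set; otherwise it behaves as a NOP and AR is unchanged."""
    if not carry:
        return ar, carry
    return add_with_carry(x, y, carry)


def add32(x_hi, x_lo, y_hi, y_lo):
    """Multiprecision use of the carry: add two 32-bit values held as
    16-bit halves, low halves first, then high halves plus carry."""
    lo, carry = add_with_carry(x_lo, y_lo, False)
    hi, _ = add_with_carry(x_hi, y_hi, carry)
    return hi, lo
```

The add32 helper shows why the "+ C" option matters: the carry produced by the low-order addition is consumed by the high-order addition in the next instruction.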

Special Functions: The division instruction is the only ALU special function. It is executed
in two steps: DIVS computes the sign, then DIVQ computes the quotient. A full divide of a
signed 16-bit divisor into a signed 32-bit dividend requires a DIVS followed by 15 DIVQs.

Table 2-4 is a list of all ALU instructions. 



[IF condition]   | AR | = xop   | + yop         | ;
                 | AF |         | + C           |
                                | + yop + C     |

[IF condition]   | AR | = xop   | - yop         | ;
                 | AF |         | - yop + C - 1 |

[IF condition]   | AR | = yop   | - xop         | ;
                 | AF |         | - xop + C - 1 |

[IF condition]   | AR | = xop   | AND | yop ;
                 | AF |         | OR  |
                                | XOR |

[IF condition]   | AR | = PASS  | xop | ;
                 | AF |         | yop |
                                | 0   |
                                | 1   |

[IF condition]   | AR | = -     | xop | ;
                 | AF |         | yop |

[IF condition]   | AR | = NOT   | xop | ;
                 | AF |         | yop |

[IF condition]   | AR | = ABS xop ;
                 | AF |

[IF condition]   | AR | = yop + 1 ;
                 | AF |

[IF condition]   | AR | = yop - 1 ;
                 | AF |

DIVS yop, xop ;
DIVQ xop ;

Table 2-4: ALU Instructions

2.2.2 MAC Group

Standard Functions: Standard MAC instructions include multiply, multiply-accumulate,
multiply-subtract, transfer AR conditionally, and clear. As an example, consider a MAC 
instruction for multiply-accumulate in the form: 



[IF condition]   | MR | = MR + xop * yop ( | SS | ) ;
                 | MF |                    | SU |
                                           | US |
                                           | UU |



If the options "MR" and "UU" are chosen, if xop and yop are the contents of MX0 and MY0
respectively, and if the MAC overflow condition is chosen, then a conditional instruction would
read:

IF NOT MV MR = MR + MX0*MY0 (UU);

The conditional expression, IF NOT MV, tests the MAC overflow bit. If the condition is not
true, a NOP is executed. The expression MR=MR+MX0*MY0 is the multiply/accumulate
operation: the multiplier result register (MR) gets the value of itself plus the product of the X
and Y input registers selected. The modifier in parentheses (UU) treats the operands as unsigned.
There can be only one such modifier selected from the available set. (SS) means both are signed,
while (SU) and (US) mean that only the first or only the second operand, respectively, is signed;
(RND) means to round the (implicitly signed) result.

Special Functions: Accumulator saturation is the only MAC special function.

IF MV SAT MR;

The instruction tests the MAC overflow bit (MV) and saturates the MR register (for only one
cycle) if that bit is set.
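
The operand modifiers and the saturation instruction can be illustrated with a simplified Python model. Several assumptions are made and should be read as such: products are treated as plain integers (any fractional-format adjustment is ignored), the finite width of MR is not modeled, MV is taken to mean that the accumulated value no longer fits in 32 signed bits, and the (RND) modifier is not modeled. The function names are hypothetical.

```python
def to_signed16(value):
    """Interpret a 16-bit pattern as a twos-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def mac(mr, x, y, mode="SS"):
    """Model 'MR = MR + xop * yop (mode)' for mode in SS, SU, US, UU.
    The first letter applies to x, the second to y.
    Returns (new MR, MV), with MV modeled as 32-bit signed overflow."""
    xv = to_signed16(x) if mode[0] == "S" else x & 0xFFFF
    yv = to_signed16(y) if mode[1] == "S" else y & 0xFFFF
    mr += xv * yv
    mv = not (-(1 << 31) <= mr < (1 << 31))
    return mr, mv


def sat_mr(mr, mv):
    """Model 'IF MV SAT MR': clamp MR to the 32-bit signed range when
    MV is set; otherwise leave MR unchanged."""
    if not mv:
        return mr
    return 0x7FFFFFFF if mr > 0 else -0x80000000


# Accumulate the largest positive signed product three times.
acc, overflow = 0, False
for _ in range(3):
    acc, overflow = mac(acc, 0x7FFF, 0x7FFF, "SS")
saturated = sat_mr(acc, overflow)
```

In this model the bit pattern 0xFFFF multiplied by 2 gives -2 under (SS) and 131070 under (UU), which is the practical difference between the modifiers.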



Table 2-5 is a list of all MAC instructions. 



[IF condition]   | MR | = xop * yop      ( | SS  | ) ;
                 | MF |                    | SU  |
                                           | US  |
                                           | UU  |
                                           | RND |

[IF condition]   | MR | = MR + xop * yop ( | SS  | ) ;
                 | MF |                    | SU  |
                                           | US  |
                                           | UU  |
                                           | RND |

[IF condition]   | MR | = MR - xop * yop ( | SS  | ) ;
                 | MF |                    | SU  |
                                           | US  |
                                           | UU  |
                                           | RND |

[IF condition]   | MR | = MR [ (RND) ] ;
                 | MF |

[IF condition]   | MR | = 0 ;
                 | MF |

IF MV SAT MR ;

Table 2-5: MAC Instructions



2.2.3 Shifter Group 

Standard Functions: Shifter standard functions include arithmetic and logical shifts, as well
as floating point and block floating point scaling operations: derive exponent, normalize,
denormalize, and block exponent adjust. As an example, consider a Shifter instruction for
normalize:

IF NOT CE SR = SR OR NORM SI (HI);

The conditional expression, IF NOT CE, tests the "not counter expired" condition. If the
condition is false, a NOP is executed. The destination of all shifting operations is the Shifter
Result register, SR. (The destination of the exponent detection instructions is SE or SB, as shown
below.) In this example, SI, the Shifter Input register, is the operand. The amount and direction
of the shift is controlled by the signed value in the SE register in all shift operations except an
immediate shift. Positive values cause left shifts; negative values cause right shifts.

The "SR OR" modifier (which is optional) logically ORs the result with the current contents
of the SR register; this allows the user to construct a 32-bit value in SR from two 16-bit pieces.
"NORM" is the operator and "(HI)" is the modifier that determines whether the shift is relative
to the HI or LO (16-bit) half of SR. If "SR OR" is omitted, the result is passed directly into SR.
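
The effect of the (HI)/(LO) modifier and of "SR OR" can be sketched for a logical shift. This Python model is a simplification under stated assumptions: the 16-bit operand is placed in the upper or lower half of a 32-bit field and shifted by a signed amount, with bits shifted out of the field discarded. Arithmetic shifts, normalization and exponent detection are not modeled, and the function names are hypothetical.

```python
def lshift(xop, se, half="HI"):
    """Model a logical shift of a 16-bit operand into a 32-bit result.
    half selects the reference half of SR; a positive se shifts left and
    a negative se shifts right."""
    value = (xop & 0xFFFF) << (16 if half == "HI" else 0)
    if se >= 0:
        value <<= se
    else:
        value >>= -se
    return value & 0xFFFFFFFF


def sr_or(sr, shifted):
    """Model the optional 'SR OR' modifier: OR the new shifter output
    with the current contents of SR."""
    return (sr | shifted) & 0xFFFFFFFF


# Build a 32-bit value in SR from two 16-bit pieces.
sr = lshift(0x1234, 0, "HI")             # upper half first
sr = sr_or(sr, lshift(0x5678, 0, "LO"))  # then OR in the lower half
```

After these two steps sr holds 0x12345678, which is the "two 16-bit pieces" construction mentioned above.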

Special Functions: Shift-immediate is the only Shifter special function. The number of 
places (exponent) to shift is specified in the instruction word. 



Table 2-6 provides a list of all Shifter instructions. 



[IF condition]   SR = [SR OR] ASHIFT xop ( | HI | ) ;
                                           | LO |

[IF condition]   SR = [SR OR] LSHIFT xop ( | HI | ) ;
                                           | LO |

[IF condition]   SR = [SR OR] NORM xop   ( | HI | ) ;
                                           | LO |

[IF condition]   SE = EXP xop ( | HI  | ) ;
                                | LO  |
                                | HIX |

[IF condition]   SB = EXPADJ xop ;

SR = [SR OR] ASHIFT xop BY <data> ( | HI | ) ;
                                    | LO |

SR = [SR OR] LSHIFT xop BY <data> ( | HI | ) ;
                                    | LO |

Table 2-6: Shifter Instructions

2.3 DATA MOVE INSTRUCTIONS

These instructions move data to and from data registers and external memory. ADSP-2101 
registers are divided into two groups, referred to as reg which includes almost all registers and 
dreg or data registers, which is a subset. Only the program counter (PC) and the ALU and MAC 
feedback registers (AF and MF) are not accessible. Table 2-7 shows which registers belong to 
these groups. Many of the ADSP-2101 system control registers are memory-mapped. These 
are read and written as memory locations instead of with register names. 







Accessible Registers: reg

    SB
    PX
    I0 - I7, M0 - M7, L0 - L7
    CNTR
    ASTAT, MSTAT, SSTAT
    IMASK, ICNTL
    TX0, TX1, RX0, RX1
    IFC

    Data Registers: dreg

        AX0, AX1, AY0, AY1, AR
        MX0, MX1, MY0, MY1, MR0, MR1, MR2
        SI, SE, SR0, SR1

Table 2-7: ADSP-2101 Register Set: reg & dreg







There are five classes of data move instruction. Except for immediate instructions, data 
addresses are computed by the DAGs via the contents of their index (I) and modify (M) registers. 
The data move classes are: 

• Load register immediate 

• Register-to-register move 

• Immediate address DM move 

• Indirect address DM or PM move 

• Multifunction DM and PM read 

In the description of each of these classes below, "immediate value" refers to a 16-bit number
contained in the instruction field while "immediate address" refers to a 14-bit address contained 
in the instruction field. 

Load Register Immediate: In this instruction, the data is provided by a 16-bit immediate
value in the instruction word and is moved into reg.

reg = <data>; <data> = immediate value.

Register-to-Register Move: Here the value of a permissible reg is moved into another
permissible reg.

reg = reg;

Immediate Address Data Memory Move: Here data memory addressed by a 14-bit
immediate address in the instruction-word is moved into a reg. Since no registers are tied up 
in generating addresses, any accessible reg can be the operand. 

Data Memory Read 

reg = DM(<addr>) ; <addr> = immediate address. 

Data Memory Write 
DM(<addr>) = reg; 

Indirect Address Memory Move: These instructions cause data to be moved between dreg
and either DM or PM. The memory address for the current operation is provided by one of
the four I registers (Im) of a DAG. The value of I is then stored back again after being modified
by the contents of one of the four M registers (Mn) and the buffer length stored in one of the
four L registers (Lm). The operation is:

Read/write at address specified by Pointer: 

Memory address = Im 
Then Modify pointer: 

Im = (Im + Mn) mod Lm 

Indirect addressing may access either data-memory or program-memory. The register set in 
DAG1 can only be used for data-memory while register set in DAG2 can be used for either 
data- or program-memory. The buffer length registers Lm are paired with the index registers 
Im, i.e., L5 is the modulus register for the memory location indexed at I5.
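
The post-modify step can be modeled as follows. This Python sketch implements the update Im = (Im + Mn) mod Lm given above, generalized with a buffer base address so that buffers need not start at address 0. The base parameter, the treatment of a zero length as "no wrap-around", and the function name post_modify are assumptions of this sketch rather than statements taken from the text.

```python
def post_modify(i, m, length, base=0):
    """Return the new index value after an indirect access.
    i: current index (I register), m: signed modify value (M register),
    length: buffer length (L register), base: assumed buffer start."""
    if length == 0:
        # Assumption of this sketch: a zero length means linear addressing.
        return i + m
    return base + (i - base + m) % length


# Step through a four-word circular buffer with a modify value of 1.
sequence = []
index = 0
for _ in range(5):
    sequence.append(index)
    index = post_modify(index, 1, 4)
```

The resulting sequence is 0, 1, 2, 3, 0: the pointer wraps back to the start of the buffer without any explicit test in the program, which is what makes circular buffers cheap for delay lines and filters.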

Memory Read into Data Register 
dreg = DM(Im,Mn) ; 
dreg = PM(Im,Mn) ; 

DM Write: Immediate or from Data Register 
DM (Im, Mn) = | dreg | 
| <data> | 

PM Write from Data Register 
PM(Im,Mn) = dreg; 

Multifunction Data- and Program-Memory Read: Here a combined Move instruction
reads data into a pair of ALU or MAC input registers from both DM and PM in the same cycle.
It is described in more detail in the next section.

Table 2-8 gives a list of all data Move instructions. 



reg = <data> ;

reg = reg ;

reg = DM (<address>) ;

DM (<address>) = reg ;

dreg = DM ( I0-I3 , M0-M3 ) ;
dreg = DM ( I4-I7 , M4-M7 ) ;

DM ( I0-I3 , M0-M3 ) = | dreg   | ;
DM ( I4-I7 , M4-M7 )   | <data> |

dreg = PM ( I4-I7 , M4-M7 ) ;

PM ( I4-I7 , M4-M7 ) = dreg ;

(Here I0-I3 stands for any one of I0, I1, I2 or I3, and similarly for the other
ranges; the I and M registers in one instruction come from the same DAG.)

Table 2-8: Data Move Instructions



2.4 MULTIFUNCTION INSTRUCTIONS 

Multifunction operations exploit the inherent parallelism of the ADSP-2101 architecture by
providing combinations of data moves, memory reads and memory writes and computation in
a single cycle.

2.4.1 ALU/MAC with Data and Program Memory Read

Perhaps the most common single operation in DSP algorithms is the sum of products, like the 
following: 

• Fetch two operands (such as a coefficient and a data point) 

• Multiply them and sum the result with previous products 

The ADSP-2101 can execute both data fetches and the multiplication/accumulation in a
single cycle. Typically, a loop of multiply/accumulates can be expressed in ADSP-2101 source
code in just two program lines. Since the on-chip program memory is fast enough to provide
an operand and the next instruction in a single cycle, loops of this type can execute with sustained
single-cycle throughput. An example of such an instruction is:

MR=MR+MX0*MY0(SS), MX0=DM(I0,M0), MY0=PM(I4,M5);

The first clause of this instruction (up to the first comma) says that MR, the MAC result
register, gets the sum of its previous value plus the product of the (current) X and Y input
registers of the MAC (MX0 and MY0) both treated as signed (SS). Note the simple assignment
statement form of the source code.

In the second and third clauses of this multifunction instruction, two new operands are
fetched. One is fetched from the data memory (DM) location pointed to by index register zero
(I0, post-modified by the value in M0) and the other is fetched from the program memory
location (PM) pointed to by I4 (post-modified by M5 in this instance). Note that indirect memory
addressing uses a syntax similar to array indexing, with DAG registers providing the index
values. Any I register may be paired with any M register within the same DAG.

As discussed in Chapter 1, registers are read at the beginning of the cycle and written at
the end of the cycle. The operands present in the MX0 and MY0 registers at the beginning of
the instruction cycle are multiplied and added to the MAC result register, MR. The new operands
fetched at the end of this same instruction overwrite the old operands after the multiplication
has taken place and are available for computation on the following cycle. The user may, of
course, load any data registers in conjunction with the computation, not just the MAC registers
with a MAC operation as in our example.
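
The read-at-start, write-at-end behavior of this instruction can be traced with a small Python model of a sum-of-products loop. It is a sketch under simplifying assumptions: operands are plain integers, overflow and number formats are ignored, the two memories are represented by Python lists, and the function name sum_of_products is hypothetical.

```python
def sum_of_products(data, coeffs):
    """Model a loop built around
    'MR=MR+MX0*MY0(SS), MX0=DM(I0,M0), MY0=PM(I4,M5);'."""
    if len(data) != len(coeffs):
        raise ValueError("data and coeffs must have the same length")
    if not data:
        return 0
    mr = 0
    # Prime the MAC input registers with the first operand pair.
    mx0, my0 = data[0], coeffs[0]
    for k in range(1, len(data)):
        # One multifunction instruction: the product uses the register
        # values present at the start of the cycle ...
        product = mx0 * my0
        # ... while MR and the newly fetched operands are written at the
        # end of the cycle, ready for the next iteration.
        mr, mx0, my0 = mr + product, data[k], coeffs[k]
    # A final multiply/accumulate consumes the last fetched pair.
    return mr + mx0 * my0
```

The model makes the software pipeline visible: one priming fetch before the loop and one extra multiply/accumulate after it, with every loop iteration doing a multiply and two fetches at once.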

The computational part of this multifunction instruction may be any unconditional ALU 
instruction except division or any MAC instruction except saturation. Certain other restrictions 
apply: the next X operand must be loaded into MX0 from data memory and the new Y operand
must be loaded into MY0 from program memory (internal and external memory are identical
at the level of the instruction set). The result of the computation must go to the result register 
(MR or AR) not to the feedback register (MF or AF). 

2.4.2 Data and Program Memory Read 

This instruction is a special case of the instruction above, in which the computation is left out. 
It is also discussed in Section 2.3 as a multifunction data move instruction. It executes only the 
dual fetch as shown below. 

AX0 = DM(I2,M0), AY0 = PM(I4,M6);

In this example, we have used the ALU input registers as the destination. As with the 
previous multifunction instruction, X operands must come from data memory and Y operands 
from program memory (internal or external memory in either case). 

2.4.3 Computation With Memory Read 

If a single memory read is performed, instead of the dual memory read of the previous two
multifunction instructions, a wider range of computations can be executed. The legal
computations include all ALU operations except division, all MAC operations and all Shifter
operations except SHIFT IMMEDIATE. Computation must be unconditional.

An example of this instruction is:

AR=AX0+AY0, AX0=DM(I0,M3);

Here an addition is performed in the ALU while a single operand is fetched from data memory.
The restrictions are similar to those for previous multifunction instructions. The value of AX0,
used as a source for the computation, is the value at the beginning of the cycle. The data read
operation loads a new value into AX0 by the end of the cycle. For this same reason, the destination
register (AR in the example above) cannot be the destination for the memory read. If that were
legal, there would be a conflict.

2.4.4 Computation With Memory Write 

The computation with memory write instruction is similar in structure to the immediately
preceding one; the order of the clauses in the instruction line, however, is reversed. First the
memory write is performed; then the computation is performed as shown below:

DM(I0,M0)=AR, AR=AX0+AY0;

Again, the value of the source register for the memory write (AR in the example) is the
value at the beginning of the instruction. The computation loads a new value into the same
register; this is the value in AR at the end of this instruction. Reversing the order of the clauses
of the instruction is illegal and invokes an assembler warning; it would imply that the result of
the computation is written to memory when, in fact, the previous value of the register is what
is written. There is no requirement that the same register be used in this way although this will
usually be the case in order to pipeline operands to the computation.

The restrictions on computation operations are identical to those above. All ALU 
operations except division, all MAC operations and all Shifter operations except SHIFT 
IMMEDIATE are legal. Computation must be unconditional. 
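
The pipelining effect of a memory write combined with a computation can be sketched in Python. The helper write_with_compute and the register and memory values are hypothetical; the sketch assumes only what the text states, namely that the write stores the register value from the start of the instruction.

```python
def write_with_compute(regs, dm, address):
    """Model DM(address)=AR, AR=AX0+AY0 executed in one cycle (illustrative only)."""
    start = dict(regs)                       # register values at the start of the cycle
    dm[address] = start["AR"]                # memory receives the previous AR
    regs["AR"] = (start["AX0"] + start["AY0"]) & 0xFFFF  # AR is then replaced


regs = {"AX0": 1, "AY0": 2, "AR": 99}        # AR holds the previous result
dm = {}
write_with_compute(regs, dm, 0x0200)
print(dm, regs["AR"])
```

The stored word is the earlier result (99), and the new sum (3) would be written by the next such instruction, which is what makes the form convenient inside a loop.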

2.4.5 Computation With Data Register Move 

This final multifunction instruction performs a data register to data register move in parallel 
with a computation. Most of the restrictions applying to the previous two instructions apply to 
this instruction. 

AR=AX0+AY0, AX0=MR2; 

Here an ALU addition operation occurs while a new value is loaded into AX0 from MR2. 
As before, the value of AX0 at the beginning of the instruction is the value used in the 
computation. The move may be from or to all ALU, MAC and Shifter input and output registers 
except the feedback registers (AF and MF) and SB. 

In the example, the data register move loads the AX0 register with the new value at the 
end of the cycle. All ALU operations except division, all MAC operations and all Shifter 
operations except SHIFT IMMEDIATE are legal. Computation must be unconditional. 

A complete list of multifunction instructions appears in Table 2-9. 

<ALU>¹ | <MAC> ,  AX0 | AX1 | MX0 | MX1  =  DM ( I0-I3 , M0-M3 ) , 
                  AY0 | AY1 | MY0 | MY1  =  PM ( I4-I7 , M4-M7 ) ; 

AX0 | AX1 | MX0 | MX1  =  DM ( I0-I3 , M0-M3 ) , 
AY0 | AY1 | MY0 | MY1  =  PM ( I4-I7 , M4-M7 ) ; 

<ALU> | <MAC> | <SHIFT> ,  dreg  =  DM ( I0-I3 , M0-M3 ) ; 
                                    DM ( I4-I7 , M4-M7 ) ; 
                                    PM ( I4-I7 , M4-M7 ) ; 

DM ( I0-I3 , M0-M3 )  =  dreg ,  <ALU> | <MAC> | <SHIFT> ; 
DM ( I4-I7 , M4-M7 ) 
PM ( I4-I7 , M4-M7 ) 

<ALU> | <MAC> | <SHIFT> ,  dreg = dreg ; 

Table 2-9: Multifunction Instructions 



¹ All computation is unconditional; ALU division and SHIFT IMMEDIATE operations are prohibited. 

2.5 PROGRAM FLOW CONTROL INSTRUCTIONS 

Program flow control on the ADSP-2101 is simple yet powerful. It directs the program 
sequencer. In normal operation, the sequencer automatically fetches the next contiguous instruction 
for execution. This flow can be altered by these instructions. Program flow control provides: 

• Jumps to interrupt service routines and calls to subroutines 

• Return from interrupts and subroutines 

• FLAG pin conditions 

• DO loops 

• IDLE instruction 

An optional "IF" condition can first test any of the status conditions defined earlier. 

JUMP and CALL Instructions: JUMP is a familiar construct from many other processors. 
As an example consider the following statement: 

IF EQ JUMP my_label; 

Here my_label is any identifier used as the new address to which program control is transferred for 
execution. Instead of the label, an index register in DAG2 may be explicitly used. The default 
scope for any label is the module in which it is declared. The Assembler directive .ENTRY 
makes a label "visible" as an entry point for routines outside the module. Conversely, the 
.EXTERNAL directive makes it possible to use a label declared in another module. A CALL 
instruction, on the other hand, transfers control to a subroutine and pushes the present PC 
onto the PC stack as the return address. 

RETURN Instructions: There are two return statements: RTS from a subroutine, and RTI from 
an interrupt service routine. In either case, a RETURN pops the return address from the PC 
stack. A return from an interrupt also pops the status stack, returning the arithmetic status, mode 
status and interrupt mask registers to the values they had prior to the interrupt. 
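
The stack behaviour of CALL, RTS and RTI can be sketched with a small Python model. The class and method names are hypothetical, and the sketch assumes that the return address pushed by a CALL is the address of the instruction following it and that an interrupt saves the three status registers named above.

```python
class Sequencer:
    """Tiny illustrative model of the PC stack and the status stack."""

    def __init__(self, pc=0):
        self.pc = pc
        self.status = {"ASTAT": 0, "MSTAT": 0, "IMASK": 0}
        self.pc_stack = []
        self.status_stack = []

    def call(self, target):
        self.pc_stack.append(self.pc + 1)    # assumed: resume after the CALL
        self.pc = target

    def rts(self):
        self.pc = self.pc_stack.pop()

    def interrupt(self, vector):
        self.pc_stack.append(self.pc)        # instruction that has not yet run
        self.status_stack.append(dict(self.status))
        self.pc = vector

    def rti(self):
        self.pc = self.pc_stack.pop()
        self.status = self.status_stack.pop()  # status restored as before the interrupt


seq = Sequencer(pc=0x10)
seq.call(0x100)             # enter a subroutine
seq.status["ASTAT"] = 0x01  # status set by the subroutine
seq.interrupt(0x04)         # interrupt arrives inside the subroutine
seq.status["ASTAT"] = 0x55  # the service routine disturbs the status
seq.rti()
pc_after_rti, astat_after_rti = seq.pc, seq.status["ASTAT"]
seq.rts()
print(hex(pc_after_rti), hex(astat_after_rti), hex(seq.pc))
```

After the RTI the model resumes inside the subroutine with its status intact, and the RTS then returns to the instruction following the original CALL.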

FLAG Instructions: JUMP and CALL permit the additional conditionals "FLAG_IN" and 
"NOT FLAG_IN" to be used for branching on the state of the FI pin, but only with direct 
addressing, not with DAG2 as the address source. Additionally, the FO pin (Flag Out) can be set, 
cleared or toggled. Although this instruction does not alter the flow of the program, it provides 
a control structure for multiprocessor communication and is therefore included in this group. 

Loop Instruction: The DO UNTIL instruction performs the zero-overhead looping operation. 

DO <addr> [UNTIL condition]; 

The label <addr> designates the last instruction in the loop. The "condition" determines the 
termination of the loop. When the DO is executed, <addr> and the termination condition are 
pushed onto the Sequencer loop stack, and the current PC+1 (the address of the first instruction 
of the loop) is pushed onto the PC stack so that execution can return to the top of the loop 
until the loop terminates. The looping hardware automatically checks 
the condition code whenever execution passes through the address <addr>. 
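
The interplay of the two stacks can be sketched in Python. This is a simplified model under several assumptions that go beyond the text: the termination condition is taken to be a counter-expired test (written "CE" here) on a loop counter, the PC stack entry is taken to hold the address of the first loop instruction, and the function name run_do_until is hypothetical.

```python
def run_do_until(do_addr, end_addr, count):
    """Return the addresses executed for a counted DO loop (illustrative model)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    pc_stack = [do_addr + 1]            # assumed: address of the first loop instruction
    loop_stack = [(end_addr, "CE")]     # last instruction and termination condition
    cntr = count
    trace = [do_addr]                   # the DO instruction itself
    pc = do_addr + 1
    while loop_stack:
        trace.append(pc)
        if pc == loop_stack[-1][0]:     # execution passes through <addr>
            cntr -= 1
            if cntr == 0:               # condition true: leave the loop
                loop_stack.pop()
                pc_stack.pop()
                pc += 1
            else:                       # condition false: back to the top
                pc = pc_stack[-1]
        else:
            pc += 1
    return trace


trace = run_do_until(10, 12, 2)
print(trace)
```

For a two-instruction loop body at addresses 11 and 12 executed twice, the model visits 10, 11, 12, 11, 12 without spending any instruction on the branch back, which is the sense in which the loop has zero overhead.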

IDLE Instruction: This instruction provides a way to wait for interrupts. IDLE causes the 
processor to wait in a low-power state until an interrupt occurs. It uses less power than loops 
created with JUMP. 

Table 2-10 provides a complete list of program flow control instructions. 





[IF condition]               JUMP  (I4) | (I5) | (I6) | (I7) | <address> ; 

[IF condition]               CALL  (I4) | (I5) | (I6) | (I7) | <address> ; 

IF  FLAG_IN | NOT FLAG_IN    JUMP  <address> ; 

IF  FLAG_IN | NOT FLAG_IN    CALL  <address> ; 

[IF condition]               RTS ; 

[IF condition]               RTI ; 

DO <address> [UNTIL termination] ; 

IDLE ; 

Table 2-10: Program Flow Control Instructions 



2.6 MISCELLANEOUS INSTRUCTIONS 

There are several miscellaneous instructions. NOP, of course, is a no operation instruction. The 
PUSH/POP instruction allows explicit control of the status, counter, PC and loop stacks; interrupt 
servicing automatically pushes and pops some of these stacks. 

The Mode Control (enable/disable) instructions turn on and off several modes of operation. 
The instruction governs modes common to the ADSP-2100 (bit-reversal on DAG1, latching 
ALU overflow, saturating the ALU result register, choosing the primary or shadow register set) 
and the ADSP-2101 extended mode controls (GO mode for continued operation during Bus 
Grant, multiplier shift mode for fractional or integer arithmetic and timer enabling). 

A single ENA or DIS can be followed by any number of mode identifiers, separated by 
commas; ENA and DIS can also be repeated. All seven modes can be enabled, disabled or 
changed in a single instruction. 

The MODIFY instruction modifies the address pointer in the selected I register with the 
value in the selected M register, without performing any actual memory access. As always, the 
I and M registers must be from the same DAG; any of I0-I3 may be used only with one from 
M0-M3 and the same for I4-I7 and M4-M7. If circular buffering is in use, modulus logic applies 
(See Chapter 3, "Data Moves," for more information). 

Table 2-11 gives a complete list of miscellaneous instructions. 



[IF condition]  SET | RESET | TOGGLE  FLAG_OUT ; 

NOP ; 

[ PUSH | POP  STS ] [, POP CNTR] [, POP PC] [, POP LOOP] ; 

ENA | DIS  BIT_REV | AV_LATCH | AR_SAT | SEC_REG | G_MODE | M_MODE | TIMER  [, ...] ; 

MODIFY ( I0 | I1 | I2 | I3 , M0 | M1 | M2 | M3 ) ; 
MODIFY ( I4 | I5 | I6 | I7 , M4 | M5 | M6 | M7 ) ; 

Table 2-11: Miscellaneous Instructions 



2.7 DATA STRUCTURES 

The ADSP-2101 Cross-Software supports the declaration and use of a simple set of data 
structures: one-dimensional arrays and ports. The array may be a single value or multiple values. 
In addition, the array may be used as a circular buffer. Here is a brief discussion of each instance 
with an example of how they are declared and used. Complete syntax for these and other 
directives is given in the ADSP-2101 Cross-Software Manual [2]. 

2.7.1 Arrays 

Arrays are the basic data structures in the ADSP-2101 instruction set. In ADSP-2101 literature, 
the word "array" and the expression "data buffer" are used interchangeably. Arrays are declared 
with Assembler directives and can be referenced indirectly and by name, can be initialized from 
immediate values in a directive or from external data files and can be linear or circular with 
automatic wraparound. Assembler Directives are described in detail in Chapter 3. 

An array is declared with a directive such as 

.VAR/DM coefficients[128]; 

This declares an array of 128 16-bit values located in data memory (DM). The special operators 
^ and % reference the address and length, respectively, of the array. It could be referenced as 
shown below. 

I0 = ^coefficients;        {point to address of buffer} 

MX0=DM(I0,M0);             {load MX0 from buffer} 

These instructions load a value into MX0 from the beginning of the coefficients buffer in data 
memory. With the automatic post-modify of the DAGs, the user could execute the second of 
these instructions in a loop and continuously advance through the buffer. 

Alternatively, when only the first location needs to be addressed, one can directly use the 
buffer name as a label in many circumstances, such as 

MX0=DM(coefficients); 

The Linker substitutes the actual address for the label. It is also possible to initialize a complete 
array/buffer from a data file, using the INIT directive. 

.INIT coefficients: <filename.dat>; 

This reads the values from the file filename.dat into the array at link time. This feature is 
supported only in the ADSP-210X Simulators even though data cannot be loaded directly into 
on-chip data memory by the hardware booting sequence. 

An array or data buffer with a length of one behaves as a simple single-word variable. 

2.7.2 Circular Arrays/Buffers 

A common requirement in DSP is the circular buffer. This is directly implemented by the 
ADSP-2101 DAGs, using the L (length) registers. First, the buffer must be declared as circular: 

.VAR/DM/CIRC coefficients[128]; 

This identifies it to the Linker for placement on the proper address boundary. Next, the L register 
must be initialized, typically using the % operator (or a constant), along with the I register 
and M register, as in the example below. 

L0 = %coefficients;        {length of circular buffer} 

I0 = ^coefficients;        {point to address of buffer} 

M0 = 1;                    {increment by 1 location each time} 

Now a statement such as 

MX0=DM(I0,M0);             {load MX0 from buffer} 

in a loop, cycles continuously through coefficients and wraps around automatically. L registers 
should be initialized to zero for buffers of any length that are not circular. 
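
The wraparound can be modelled in a few lines of Python. This is a sketch rather than a description of the actual DAG circuitry: the function name post_modify, the 14-bit address mask and the base address 0x0080 are assumptions made for illustration, and the modifier is assumed to be smaller in magnitude than the buffer length.

```python
def post_modify(i, m, length, base):
    """Return the I-register value after one access with modifier m.

    length is the L-register value; 0 selects linear (non-circular) addressing.
    """
    if length == 0:
        return (i + m) & 0x3FFF              # plain address arithmetic (14 bits assumed)
    return base + (i + m - base) % length    # stay inside [base, base + length)


base = 0x0080          # hypothetical buffer address chosen by the Linker
i0 = base
visited = []
for _ in range(130):   # two accesses more than the buffer length
    visited.append(i0)                 # address used by MX0=DM(I0,M0)
    i0 = post_modify(i0, 1, 128, base)

print(hex(visited[127]), hex(visited[128]), hex(visited[129]))
```

The 129th access lands on the first location again. Calling post_modify with length 0 simply keeps incrementing, which corresponds to the linear buffers for which the L registers are set to zero.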

2.7.3 Ports & Memory-Mapping 

The .PORT directive in the System Builder module allows the user to refer to a specific hardware 
address with an identifier of his choosing as shown here. This capability makes it easy to interface 
to memory-mapped peripherals, such as converters. 

.PORT/ABS=H#800 converter_in; 

After declaring the same identifier in the Assembler, a value can be read directly from 
the port with a statement such as 

SI = DM(converter_in); 

This loads the SI register with the value present at the address specified in the System Builder. 
(The Linker reads the Architecture Description file produced by the System Builder to obtain 
the actual address for the label.) The user can change the hardware address of the port without 
having to rewrite the entire program. 

2.8 SUMMARY 




In this chapter, we reviewed the very rich instruction set through which the user gains access 
to many unusual features of the ADSP-2101 microcomputer. Some of these features are: 

• Two addressing modes; 

• Multifunction instructions: arithmetic and data moves in one cycle; 

• Single-cycle context save: swaps the full register set; 

• Zero-overhead looping; 

• Parallel access to both data and program operands. 

The ADSP-2101 instruction set discussed in this chapter is a superset of the ADSP-2100 
instruction set. It is source and object code compatible with the ADSP-2100. An ADSP-2100 
program may need to be relocated to utilize internal memory and conform to the ADSP-2101's 
interrupt vector and reset vector placement. 

A complete program example using this instruction set requires knowledge of the software 





Chapter 3 

Overview of Development Tools 



3.1 INTRODUCTION 

The ADSP-2101 is a compact yet very efficient VLSI circuit containing 68 pins. As described 
in Chapter 1, these pins provide terminals for power, data and program memory, input/output, 
as well as a number of other connections for interfacing with the outside world. There are 
several important issues that must be considered before the ADSP-2101 can be successfully 
interfaced in a real application. First, the chip itself must be placed in a proper system architecture 
(also known as the target hardware configuration) that supports it. This architecture contains 
memory resources and input/output devices. Second, the chip must be correctly programmed 
in its native assembly language so that it can perform its intended function. In almost all 
applications this source program code is not known in advance and therefore must be developed 
from scratch with several iterations of debugging. Third, the chip and its object code must be 
tested in an actual environment with real input/outputs with interrupts and real memory resources 
driven at the true speed of the chip. Finally, the object code must be burned in a PROM (or 
EPROM) and the chip along with the PROM must be integrated with the target system. This 
activity is called system development and is the main hidden cost of digital signal processing. 
It is carried out by the development tools provided by the manufacturer of the chip. 

Although this system development is needed only initially when the application is being 
designed and not in the final product, it must be stressed that a student or an engineer will most 
likely be working with development tools. Therefore a complete understanding of the 
capabilities of these tools is as essential as the architecture of the chip itself. In this chapter, we 
briefly describe development tools that support the ADSP-2101 microcomputer. For detailed 
descriptions of these tools, appropriate manuals [2 - 4] are recommended. 

The development process begins with the task of defining the target system hardware 
environment. To define the hardware environment, the System Builder is used. The System 
Specification file includes the target hardware information. The System Builder reads this file 
and creates an Architecture Description file which passes information about the target hardware 
to the Linker, Simulator, and Emulator. 

Code generation begins by creating assembly source code modules. An assembly module 
is a unit of source code such as a calling program, subroutine, data buffer declaration section 
or any combination. Each assembly code module is assembled separately by the Assembler. 
Several modules are then linked together to form an executable program. 

The Linker needs the target hardware information located in the Architecture Description 
file to determine placement of the code and data fragments. In the assembly modules we have 
the option to specify each code/data fragment as completely relocatable, relocatable within a 
defined memory segment, or placed at an absolute address. Absolute code or data modules are 
placed at the specified base address, provided the specified memory area has the correct 
attributes. 


Using the Architecture Description file and the Assembler output files, the Linker 
determines the placement of relocatable code and data segments (including circular buffers), 
and places all segments in memory locations with the correct attributes (CODE or DATA, RAM 
or ROM). The Linker generates an executable image file, which may be loaded into the Simulator 
and Emulator for debugging. 

The Simulator provides windows that display different aspects of the hardware 
environment. To replicate the target hardware environment, the Simulator configures its memory 
according to the System Builder output, and simulates I/O ports according to user-entered 
Simulator commands. This simulation provides capabilities to debug the system and analyze 
performance before committing to a hardware prototype. 

After debugging with the Simulator, the Emulator is used in the prototype target system 
to debug hardware, timing, and real-time software problems. It provides overlay memory to 




The PROM Splitter translates the executable memory image file (Linker output) into a 
file that is compatible with a PROM burner. Once the ADSP-2101 code is burnt into PROM 
and an ADSP-2101 is plugged into the target board, the prototype is ready to run. 

Figure 3-1 shows a flow chart of the ADSP-2101 development cycle. All the above steps 
in the development process except emulation are carried out by the software development system 
while the hardware development consists of the Emulator and the prototype target system. 




In the remainder of this chapter we explain each of these tools. The software development 
tools are described in Section 3.2. In Section 3.3 we present hardware development tools. 
Finally in Section 3.4, some issues related to the host computer are discussed. 



3.2 SOFTWARE DEVELOPMENT TOOLS 

The software development system of the ADSP-2101 is called the Cross-Software system. It 
is a set of modules. The System Builder module provides a high level method for defining the 
architecture of systems under development. The Assembler module produces object code and 
the Linker module combines object codes and library calls into an executable file. The Simulator 
module provides an interactive instruction-level simulation with a reconfigurable user interface. 
A PROM Splitter generates PROM burner compatible files. In addition, a C Compiler, which 
generates ADSP-2101 assembly source code, is also available (but not discussed in this book). 



[Flow chart boxes: START; Define Target Hardware (System Builder); Assemble Module 1, 
Assemble Module 2, ..., Assemble Module N; C Language Modules; LINK; SIMULATE (Repeat As 
Necessary); PROM SPLIT; Burn PROMs; Prototype Test (Repeat As Necessary)] 

Figure 3-1: ADSP-2101 System Development Flow 

3.2.1 The System Builder 

The system builder module of the Cross-Software system is a software tool for describing the 
configuration of the target system's memory and I/O to the rest of the development system. The 
available absolute memory-address ranges are listed for program memory (PM) and data 
memory (DM) along with the intended attributes. The DM is usually RAM while the PM can 
be assigned to ROM for permanent programs or constants as well as to RAM blocks for coef- 
ficients or programs which may change. The memory-mapped addresses of I/O devices are 
also specified in this step. 

The amount of memory and I/O ports a target system may include is shown in Table 3-1. 



Resource                                   Maximum Available 

Data Memory                                Up to 15K words 
(16-bit data, ROM or RAM)                  (1K on-chip, up to 14K off-chip, 1K reserved) 

Program Memory                             Up to 16K words, mixed code & data 
(24-bit code or data, ROM or RAM)          (2K on-chip, up to 14K off-chip) 

Boot Memory                                Up to 64K bytes, configured as 16K words 
(24-bit code or data,                      (1 to 8 pages, each containing 2K words) 
padded to 32-bit word width) 

Memory-Mapped I/O                          Any number, up to memory limits 
                                           (Simulator limited by host file system limits) 

Table 3-1: ADSP-2101 Configurations 
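
The totals in Table 3-1 follow from simple arithmetic, which the short Python check below reproduces. The variable names are illustrative only; the figures are those quoted in the table.

```python
K = 1024  # one "K" of memory words

# Data memory: 1K on-chip plus up to 14K off-chip (a further 1K is reserved).
data_words = 1 * K + 14 * K

# Program memory: 2K on-chip plus up to 14K off-chip.
program_words = 2 * K + 14 * K

# Boot memory: up to 8 pages of 2K words, each 24-bit word padded to 32 bits (4 bytes).
boot_words = 8 * 2 * K
boot_bytes = boot_words * 4

print(data_words // K, program_words // K, boot_words // K, boot_bytes // K)
```

The boot memory line shows why 16K words occupy 64K bytes: each 24-bit word is stored in four bytes.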



The user specifies the hardware configuration in a System Specification source (.SYS) 
file using System Builder directives. The System Builder processes the .SYS file and generates 
the Architecture Description file (.ACH). The Architecture Description file is used by the Linker 
to place relocatable segments in memory, by the Simulator to simulate memory configurations, 
and by the Emulator to set up target system memory mapping. The System Builder outputs error 
messages, if any, or a summary of the architecture created to the screen. The System Builder I/O 
is shown in Figure 3-2. 

The System Builder is invoked by typing 

BLD21 filename[.ext] [-switch] 

where filename.ext is the system specification source file. The filename extension is 
optional and defaults to .SYS. There is one switch for invoking the System Builder. The -c 
switch makes the System Builder case-sensitive. This is provided primarily for compatibility 
with the C Compiler, which is always case-sensitive. 

In a System Specification file, symbolic names are assigned to the system configuration 
itself, I/O ports, and memory segments. The memory segment names may be used in the 
Assembler; memory segment names and memory characteristics are used by the Linker. 

(Use operating system pipes to capture screen output) 

Figure 3-2: System Builder I/O 

All symbolic names must be unique. A symbolic name is a string of letters, digits, and 
underscores with a letter as the first character. Symbol names can be of any length, but only 32 
characters are significant. 
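
The naming rules can be expressed as a short Python check. This is a sketch under two assumptions not spelled out above: that the 32 significant characters are the first 32, and that names are compared without regard to case unless case sensitivity is requested. The function names are hypothetical.

```python
import re

SIGNIFICANT = 32  # assumed: only the first 32 characters are compared
_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_symbol(name):
    """Letters, digits and underscores only, starting with a letter."""
    return _NAME.fullmatch(name) is not None


def same_symbol(a, b, case_sensitive=False):
    """True if the two names would be indistinguishable under the rules above."""
    a, b = a[:SIGNIFICANT], b[:SIGNIFICANT]
    if not case_sensitive:      # assumed default, i.e. without the -c switch
        a, b = a.upper(), b.upper()
    return a == b


print(is_valid_symbol("fir_system"), is_valid_symbol("2nd_port"))
print(same_symbol("a" * 32 + "x", "a" * 32 + "y"))
```

Two long names that differ only after the 32nd character collide in this model, so they would violate the uniqueness requirement.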

System Builder keywords cannot be used as symbolic names. Table 3-2 lists the System 
Builder keywords. 

Assembler keywords, listed in Table 3-4, may not be used as symbolic names either. The 
System Builder accepts such symbol definitions without flagging an error. However, the Linker 
does not. 

ABS          CODE         ENDSYS       PORT         SYSTEM 
ADSP2100     CONST        MMAP0        RAM 
ADSP2101     DATA         MMAP1        ROM 
BOOT         DM           PM           SEG 

Table 3-2: System Builder Keywords 

Listing 3-1 is an example of a system specification file for an ADSP-2101 system. Note 
that comment fields are enclosed within braces, { }, and can be inserted anywhere in the file. 

.SYSTEM fir_system;                             {system name} 
.ADSP2101;                                      {ADSP-2101 system} 
.MMAP0;                                         {boot loading enable} 
.SEG/ROM/BOOT=0 boot_mem[2048];                 {boot page one} 
.SEG/PM/RAM/ABS=0/CODE/DATA int_pm[2048];       {on-chip program mem} 
.SEG/PM/RAM/ABS=2048/CODE/DATA ext_pm[14336];   {external program mem} 
.SEG/DM/RAM/ABS=0/DATA ext_dm[14336];           {external data mem} 
.SEG/DM/RAM/ABS=14336/DATA int_dm[1024];        {on-chip data mem} 
.ENDSYS; 

Listing 3-1 : Sample System Specification File 



The System Specification Source file for the ADSP-2101 specifies the amount of data, 
program, and boot memory included in the target system. The first directive in the file is the 
.SYSTEM directive. This directive assigns the name fir_system to the hardware description and 
signals the start of the file. 

The .ADSP2101 statement identifies the processor type, here naming the ADSP-2101 
microcomputer. This statement is required. The presence of the .MMAP directive or the dec- 
laration of boot memory also serves to signal the Cross-Software that the system in question is 
an ADSP-2101 architecture. If none of these indicators are present, the System Builder assumes 
an ADSP-2100 processor. 

The .MMAP0 directive specifies the simulated state of the MMAP pin on the ADSP-2101 
in this example system. Defining MMAP as 0 indicates that boot memory is to be loaded into 
the chip's internal program memory space, beginning at address 0. 

The .SEG directive declares the system's physical memory segments and their 
characteristics. In this example, the segments declared comprise the full on-chip and off-chip program 
and data memory configuration of the ADSP-2101. Many applications, however, do not require 
this much memory space. 

Boot_mem identifies a 2K-word space for one page of external boot memory. 

Int_pm declares the 2K-word on-chip program memory space beginning at address 0. In 
the ADSP-2101 this memory can always hold both code and data and should be explicitly 
declared as such as in this example. Ext_pm declares a 14K-word space for external program 
code and data storage beginning at address 2048, after the on-chip memory. 

Ext_dm declares a 14K-word space for external data storage beginning at address 0. Int_dm 
declares the 1K-word internal data memory space beginning at address 14336. This corresponds 
exactly to the on-chip data memory of the ADSP-2101 which is available for general system 
use. The 1K of on-chip memory above this is reserved for processor use and should not be 
declared. The memory segments can be declared in any order. 
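
The address ranges implied by these declarations can be worked out mechanically. The Python sketch below parses the PM and DM .SEG lines of the listing with a regular expression written only for this example (it is not the System Builder's own parser, and the function names are hypothetical) and checks that no two segments in the same memory space overlap.

```python
import re

# The PM and DM .SEG lines of Listing 3-1.
LISTING = """
.SEG/PM/RAM/ABS=0/CODE/DATA int_pm[2048];
.SEG/PM/RAM/ABS=2048/CODE/DATA ext_pm[14336];
.SEG/DM/RAM/ABS=0/DATA ext_dm[14336];
.SEG/DM/RAM/ABS=14336/DATA int_dm[1024];
"""

SEG = re.compile(r"\.SEG/(PM|DM)/\w+/ABS=(\d+)/[\w/]+\s+(\w+)\[(\d+)\];")


def parse_segments(text):
    """Return (space, name, first_address, last_address) for each segment."""
    return [
        (space, name, int(start), int(start) + int(length) - 1)
        for space, start, name, length in SEG.findall(text)
    ]


def overlaps(segments):
    """Return pairs of segment names that share addresses in the same space."""
    clashes = []
    for i, (space_a, name_a, lo_a, hi_a) in enumerate(segments):
        for space_b, name_b, lo_b, hi_b in segments[i + 1:]:
            if space_a == space_b and lo_a <= hi_b and lo_b <= hi_a:
                clashes.append((name_a, name_b))
    return clashes


segments = parse_segments(LISTING)
for seg in segments:
    print(seg)
print("overlaps:", overlaps(segments))
```

The data memory segments end at address 15359, leaving the top 1K of the 16K data space undeclared, in line with the reserved area mentioned above.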

The last statement in a system specification file is the .ENDSYS directive. The System 
Builder stops processing when it encounters the .ENDSYS directive. 

A complete description of each System Builder directive is given in Chapter 4. In summary, 
the System Builder: 

• allows the user to specify target hardware, 

• uses high level constructs, and 

• flags inconsistencies between hardware and software. 

3.2.2 The Assembler 

The ADSP-2101 Assembler translates source code modules into object code modules. The user 
creates a source code file (.DSP) using the ADSP-2101 assembly language and defines variables, 
data buffers, and symbolic constants using assembler directives. Separately assembled modules 
are linked together to form an executable program. 

Figure 3-3 shows the Assembler input and output files. The ADSP-2101 Assembler reads 
the source code file (.DSP) and generates four output files with the same root name: an object 
file (.OBJ), a code file (.CDE), an initialization file (.INT), and a list file (.LST). The object 
file, code file and initialization files are passed to the Linker. The object file contains information 
on memory allocation and symbol declarations. The code file contains instruction opcodes with 
unresolved symbols marked. The initialization file contains initialization information for data 
buffers. The list file, which is optional, is for documentation. 

Using assembly directives in the source code file, the user can include other source code 
files and inform the Linker of initialization data files in the assembly process. The Assembler 
reads these files and processes them together with the original source file. There are two 
preprocessors of the Assembler, an ANSI-standard C language module and a standard preprocessor. 
The Assembler also supports a macro capability. 

Assembler Modules 

The Assembler consists of three modules: 

C language preprocessor     actual filename: ASMPP 

standard preprocessor       actual filename: ASM21 

core assembler              actual filename: ASM2 

Different combinations of the modules can be run using the Assembler switches detailed 
below. Invocation of the Assembler with no switches runs the standard preprocessor and core 
assembler only. 

Figure 3-3: Assembler I/O 

Running The Assembler 

The Assembler is invoked from the host system by entering: 

ASM21 filename[.ext] [-switch ...] 



Filename[.ext] is the source code file. The filename extension is optional and defaults to 
.DSP. Other data and source code files are included in the assembly process using the directives 
.INIT and .INCLUDE (described later in this chapter). 

The Assembler switches are not case-sensitive, and multiple switches must be separated 
by spaces. The Assembler switches are listed below in Table 3-3; some require arguments as 
shown. This list can be displayed on the monitor by invoking the Assembler with no filename 
or switches: ASM21. 
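
The file naming conventions described in this section can be summarized in a small Python helper. It is only an illustration of the default extension and of the four output files that share the root name; the function name and dictionary keys are hypothetical and the helper does not invoke the Assembler.

```python
import os


def assembler_files(filename):
    """Return the source file name and the four output file names."""
    root, ext = os.path.splitext(filename)
    source = filename if ext else filename + ".DSP"   # default extension
    outputs = {
        "object": root + ".OBJ",
        "code": root + ".CDE",
        "initialization": root + ".INT",
        "list": root + ".LST",   # the list file is optional
    }
    return source, outputs


source, outputs = assembler_files("fir")
print(source, sorted(outputs.values()))
```

For a module named fir, the model reads fir.DSP and names fir.OBJ, fir.CDE, fir.INT and fir.LST as outputs.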



Switch                 Result 

-cp                    Runs C language preprocessor 
-p                     Runs standard preprocessor without core assembler 
-dvariable[=value]     Define variable for C preprocessor 
-l                     Creates .LST file 
-m [number]            Macros expanded in .LST file, to depth of [number] 
-i [number]            INCLUDE files expanded in .LST file, to depth of [number] 
-s                     No semantics checking 
-c                     Makes the Assembler case-sensitive 

Table 3-3: Assembler Switches 

A complete description of these switches is given in the ADSP-2101 Cross-Software Manual [2]. 



Program Structure 

The basic unit of an ADSP-2101 program is the module. Modules are defined by: 

.MODULE[/qualifiers] module_name; 

STATEMENT;          (may be any of   • [label:] instruction 
...                                  • directive 
                                     • macro invocation) 

.ENDMOD; 

Each element of the source module must end with a semicolon. Statements can be either 
an instruction, assembler directive, or macro call. Giving an instruction a label is optional. The 
.MODULE and .ENDMOD directives are defined in the section "Assembler Directives." Symbol 
names in the source code module must be unique. Assembler-reserved symbols may not be used 
as identifiers. Because the Assembler is not case sensitive, both upper and lower case keywords 
are reserved. Table 3-4 lists the assembler keywords. Some of those listed correspond to 
ADSP-2101 features which are not visible to users. Avoid them because their use may cause 
errors. Individual lines in each module must be no more than 200 characters in length. 

ABS            AC             AF             ALT_REG        AND            AR 
AR_SAT         ASHIFT         ASTAT          AUX            AV             AV_LATCH 
AX0            AX1            AY0            AY1            BIT_REV        BM 
BY             C              CACHE          CALL           CE             CIRC 
CLR            CLEAR          CNTR           CONST          DIS            DIVS 
DIVQ           DM             DO             EMODE          ENA            ENDMACRO 
ENDMOD         ENTRY          EQ             EXP            EXPADJ         EXTERNAL 
FOREVER        FLAG_IN        FLAG_OUT       GE             GLOBAL         GT 
INCLUDE        INIT           JUMP           L0             L1             L2 
L3             L4             L5             L6             L7             LE 
LOCAL          LOOP           LSHIFT         LT             M0             M1 
M2             M3             M4             M5             M6             M7 
MACRO          MF             M_MODE         GO_MODE        MODIFY         MODULE 
MR             MR0            MR1            MR2            MSTAT          MV 
MX0            MX1            MY0            MY1            NAME           NE 
NEG            NEWPAGE        NOP            NORM           NOT            OR 
PASS           PC             PM             POP            PORT           POS 
PRI            PUSH           RAM            REGBANK        RESET          RND 
ROM            RTI            SU             TEST           TIMER          TOGGLE 
TOPOFPCSTACK   TRAP           TRUE           TX1            TX0            UNTIL 
US             UU             VAR            XOR 

Table 3-4: Assembler-Reserved Symbols/Keywords 

Assembler Directives 

Assembler directives are instructions that control the assembly process. They do not produce 
opcodes. In the source file, an assembler directive statement starts with a period and ends with 
a semicolon. An assembler directive may take modifiers and arguments, as specified in each of 
the following sections. Some assembler directives which may take modifiers and arguments 
are briefly described below. 

.MODULE Directive 

The .MODULE directive defines the start of an assembly module and is the first statement.
It has the form:

.MODULE[/qualifier...] module_name;

Qualifiers consist of any of the following:
RAM or ROM
ABS = absolute start address
BOOT = 0, 1, 2, 3, 4, 5, 6, or 7
SEG = memory segment name defined in System Builder

The module qualifiers determine the location of the module in memory. Memory type can
be specified as RAM or ROM, followed by the start address and/or a physical segment
in memory defined in the System Builder. (The start address is a constant.)
The example that follows defines the module main_routine, which is located at
execution time in RAM at address 0 (on-chip). The code is stored on boot page 0.

.MODULE/RAM/ABS=0/BOOT=0 main_routine;

The next example defines the module filter_routine, located in a memory segment named
fir (as defined in a System Builder output .SYS file), which is specified as ROM.

.MODULE/ROM/SEG=fir filter_routine;

.ENDMOD Directive 

The .ENDMOD directive is the last statement in a source file. The assembly process 
terminates when the Assembler reads the .ENDMOD directive. It has the form: 
.ENDMOD; 

.VAR Directive 

The .VAR directive declares data buffers. All buffers must be declared with this directive
prior to any use or reference to them. It has the form:

.VAR[/qualifier...] buffer_name[length], ...;

One .VAR directive can have an unlimited number of declarations, each separated by
commas, up to the maximum number of characters that can be processed. Specification
of length is optional, with default to one (a single word variable).

Qualifiers consist of any of the following: 
PM or DM 
RAM or ROM 
CIRC 

ABS = absolute address 

SEG = memory segment name defined in System Builder 
STATIC 

The following is an example of a variable declaration: 

.VAR/DM/RAM/ABS=0x10F seed;

This statement declares a one word variable called seed in data memory RAM, at hexa- 
decimal address 10F. 

The following is an example of one circular buffer of length five (three bits required to 
represent), which would be located by the Linker at an address that is a multiple of eight 
(has three LSBs equal to zero): 

.VAR/CIRC aa[5] ; 
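
The alignment rule just quoted can be checked with a few lines of Python. This is only a sketch of the rule as worded in the text (count the bits needed to represent the length, then require that many zero LSBs); the function names are ours, not part of any Analog Devices tool, and lengths that are exact powers of two are a boundary case that should be checked against the Linker documentation.

```python
def circ_buffer_alignment(length: int) -> int:
    """Return the address multiple implied by the rule quoted in the text."""
    if length < 1:
        raise ValueError("buffer length must be at least 1")
    bits = length.bit_length()   # bits required to represent the length
    return 1 << bits             # base address must have `bits` zero LSBs


def is_valid_base(address: int, length: int) -> bool:
    """True if `address` is an allowed base for a circular buffer of `length`."""
    return address % circ_buffer_alignment(length) == 0


# Length 5 needs three bits, so the base must be a multiple of eight.
print(circ_buffer_alignment(5))
# A 15-word buffer placed at hexadecimal address 0040 satisfies the rule.
print(is_valid_base(0x0040, 15))
```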

.INIT Directive 

The .INIT directive initializes a declared variable or all or part of a data buffer (in either
DM or PM). The buffer is initialized with the value(s) listed or those contained in an
external buffer. This directive has the form:

.INIT buffer_name: constant or expression, ...,
      ^other_buffer[offset] or %other_buffer[offset], ...,
      <filename>;

Any combination of the three forms of initialization values shown above may be used,
separated by commas. This directive recognizes the "pointer to" (^) and "length of" (%)
operators.

In the following example, variable seed is initialized to a constant hex value.

.INIT seed: 0x3FFF;

In the second example, a variable lookup_table is set to point to the base address of buffer
sin.

.INIT lookup_table: ^sin;

.CONST Directive

The .CONST directive declares symbolic constants. Symbolic constants can be used in
place of numerical constants. The .CONST directive has the form:

.CONST const_name=constant or expression, ...;

One .CONST directive can have an unlimited number of assignment statements, each
separated by commas, up to the maximum number of characters that can be processed.
The following example defines two constants, equal to the numeric values shown.

.CONST taps=15, taps_less_one=14;

.PORT Directive

The .PORT directive declares a memory-mapped I/O port in data or program memory.
The argument for this directive is a symbolic port name. The name must be the name of
a port declared in the Architecture Description file. The .PORT directive has the form:

.PORT port_name;

The following example identifies the port ad_sample, which has been previously declared
as a specific memory location in the System Builder:

.PORT ad_sample;

.INCLUDE Directive

The .INCLUDE directive is used to include another source file in the file being assembled.
The Assembler processes the included file as if it were part of the original source file.
The .INCLUDE directive has the form:

.INCLUDE <filename>;

Source files specified by the .INCLUDE directive can have .INCLUDE statements within
them (nesting of include files is limited only by memory). The .INCLUDE directive
supports modular programming. For example, in many cases it is useful to develop a
library of subroutines or macros which are shared between different programs. Rather
than rewriting these routines for each program, you can incorporate a macro library into
the source code file using the .INCLUDE directive. In the following example, file
macro_lib is included while assembling.

.INCLUDE <macro_lib>;

Here the use of angle brackets is required.

.EXTERNAL Directive

The .EXTERNAL directive assigns the EXTERNAL attribute to variables, ports, and
program memory labels declared in other assembly modules. Symbols in other
modules can be referenced only if they are assigned the external attribute in the referencing
module and the global or entry attribute in the module where they are actually declared.
It has the form:

.EXTERNAL external_symbol, ...;

The following is an example.

.EXTERNAL fir_start; {entry label in different module}

.GLOBAL Directive 

The .GLOBAL directive assigns the GLOBAL attribute to variables, buffers and ports.
Only identifiers declared (with .VAR or .PORT) as global may be referenced in other
modules. It has the form:

.GLOBAL internal_symbol, ...;

A variable, buffer, or port that is declared within a module can be referenced only by that
module unless explicitly specified as global. For program labels which are intended to be
referenced in other modules, the .ENTRY directive rather than the .GLOBAL directive
should be used. Example:

.GLOBAL seed;

Other modules are able to refer to global identifiers by declaring those symbols as
EXTERNAL.

.ENTRY Directive 

The .ENTRY directive assigns the ENTRY attribute to program labels. This makes the 
label visible to other modules for use in subroutine calls or inter-module jumps. This 
directive has the form: 
.ENTRY program_label, ...;

In the following example, the label fir_start is visible outside the current module.

.ENTRY fir_start;

Program Example 

Listings 3-2 through 3-4 illustrate a sample source code program, an interrupt service
subroutine, and an include file for the ADSP-2101. In this example the module main_routine is the
main program and fir_routine is the subroutine. These modules are linked together to form a
complete program.

There are six possible interrupt sources for the processor plus the restart vector at address
0. Each has four locations associated with it. As described in Chapter 1, the first 28 addresses
in program memory contain the restart and interrupt vectors (0x0000 - 0x001B). The 29th PM
address (0x001C) holds the first program instruction. Since main_routine is declared at absolute
address zero, the first 28 instructions are placed in the interrupt vector locations. Because this
example uses only the restart (0x0000) vector and SPORT0 Receive (0x000C) interrupt, the
remaining instructions are simply returns (RTI).

The .VAR directive defines two circular buffers in on-chip memory: the one in data
memory RAM is used to hold a delay line of samples and the one in program memory RAM is
used to store coefficients for the filter. Data_buffer and coefficient are declared as GLOBAL
buffers in main_routine, while fir_routine declares them as EXTERNAL. The address
label, fir_start, is declared as ENTRY in fir_routine and can be referenced by main_routine,
which declares it as EXTERNAL.
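
The vector arithmetic described above is easy to verify. The short Python sketch below is ours and is only illustrative; it assumes the vector order given by the comments in Listing 3-2 (restart first, TIMER last) and four program memory locations per vector, as stated in the text.

```python
WORDS_PER_VECTOR = 4

# Order assumed from the comments in Listing 3-2.
VECTOR_NAMES = [
    "restart",
    "IRQ2",
    "SPORT0 transmit",
    "SPORT0 receive",
    "SPORT1 transmit",
    "SPORT1 receive",
    "TIMER",
]


def vector_table(names=VECTOR_NAMES, words_per_vector=WORDS_PER_VECTOR):
    """Map each vector name to its program memory address."""
    return {name: i * words_per_vector for i, name in enumerate(names)}


def first_free_address(names=VECTOR_NAMES, words_per_vector=WORDS_PER_VECTOR):
    """Address of the first location after the vector table."""
    return len(names) * words_per_vector


for name, address in vector_table().items():
    print(f"0x{address:04X}  {name}")
print(f"first program instruction at 0x{first_free_address():04X}")
```

Seven vectors of four words each occupy 28 locations, so the SPORT0 receive vector falls at 0x000C and the first ordinary instruction at 0x001C, in agreement with the addresses quoted above.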



This sample program implements a FIR filter routine and has several features worth noting.
After declaring the include file and memory buffers and performing initialization, main_routine
jumps to location restarter. Here the address registers for the data and coefficient buffers are
set up, the data buffer is cleared, and the data memory-mapped control registers of the
ADSP-2101 are set up. The functions selected include SPORT0 timing specification, μ-law
companding, and 8-bit data words. The SPORT0 interrupt is then enabled and the processor loops
on the IDLE instruction until the interrupt from SPORT0 is received. The filter is thus
interrupt-driven. When the interrupt occurs, program control shifts to the subroutine by
jumping to location fir_start.

All further activity takes place in the interrupt service routine, Listing 3-3. After the return
from interrupt, execution resumes at the WAIT loop.



{ADSP-2101 FIR Filter program
 Serial port used for I/O
 Internally generated serial clock
 12.288 MHz clock rate gives 8000 Hz sampling rate}

.MODULE/RAM/ABS=0 main_routine;         {program loaded from BOOT EPROM, MMAP=0}
.INCLUDE <const.h>;
                                        {data values}
.VAR/DM/RAM/CIRC data_buffer[taps];
.VAR/PM/RAM/CIRC coefficient[taps];
.GLOBAL data_buffer, coefficient;
.EXTERNAL fir_start;
.INIT coefficient: <coeff.dat>;         {initialize coeffs from external file}

{code starts here}
{load interrupt vector addresses}
        JUMP restarter; nop; nop; nop;  {restart interrupt}
        RTI; nop; nop; nop;             {sampling interrupt IRQ2}
        RTI; nop; nop; nop;             {SPORT0 transmit int}
        JUMP fir_start; nop; nop; nop;  {SPORT0 receive int}
        RTI; nop; nop; nop;             {SPORT1 transmit int}
        RTI; nop; nop; nop;             {SPORT1 receive int}
        RTI; nop; nop; nop;             {TIMER interrupt}

{initializations}
restarter:
        L0 = %data_buffer;              {setup circular buffer length}
        L4 = %coefficient;              {setup circular buffer length}
        M0 = 1;                         {modify=1 for increment}
        M4 = 1;                         {through buffers}
        I0 = ^data_buffer;              {point to data start}
        I4 = ^coefficient;              {point to coeff start}
        CNTR = %data_buffer;            {setup loop counter}
        DO clear_buffer UNTIL CE;
clear_buffer:
        DM(I0,M0)=0;                    {clear data buffer}

        AX0=0x1000;
        DM(sys_cont_reg)=AX0;           {SPORT0 enabled}
        AX0=0;
        DM(dm_wait_reg)=AX0;            {All DM wait states 0}
        AX0=0x6B27;
        DM(sport0_cont_reg)=AX0;        {Sport0 control register}
        AX0=2;
        DM(sport0_sclkdiv)=AX0;         {Generate 2.048 MHz serial clk}
        AX0=255;
        DM(sport0_rfsdiv)=AX0;          {Divide by 256 for 8 kHz rate}
        ICNTL = 0x07;                   {Enable edge sensitive int}
        IMASK = 0x0018;                 {enable SPORT0 interrupt only}
WAIT:
        IDLE;                           {wait for interrupt}
        JUMP WAIT;

.ENDMOD;

Listing 3-2: Main Routine Example



.MODULE/RAM fir_routine;                {relocatable interrupt service routine module}
.INCLUDE <const.h>;                     {include constant declarations}
.ENTRY fir_start;                       {make label visible outside module}
.EXTERNAL data_buffer, coefficient;     {make global buffers visible to module}

{code}
FIR_START:
        CNTR = taps-1;                  {N-1 passes within DO loop}
        SI = RX0;                       {read from SPORT0}
        DM(I0,M0) = SI;                 {transfer data to buffer}
        MR=0, MY0=PM(I4,M4), MX0=DM(I0,M0);
                                        {set up multiplier for loop}
        DO convolution UNTIL CE;        {CE = counter expired}
convolution:
        MR=MR+MX0*MY0(SS), MY0=PM(I4,M4), MX0=DM(I0,M0);
                                        {MAC these, fetch next}
        MR=MR+MX0*MY0(RND);             {Nth pass with rounding}
        IF MV SAT MR;                   {saturate if overflowed}
        TX0 = MR1;                      {write to sport transmit}
        RTI;                            {return from interrupt}

.ENDMOD;

Listing 3-3: Interrupt Routine Example
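
To see what the interrupt routine of Listing 3-3 computes, it can help to model it on the host before running it on the processor. The Python sketch below is our own reference model, not part of the ADSP-2101 tools: the class name CircularFir is hypothetical, the arithmetic is floating point rather than the processor's fixed-point format, and companding, rounding and saturation are ignored. The variable index plays the role of I0 with M0 = 1 and a circular buffer of length taps.

```python
class CircularFir:
    """Host-side model of the circular-buffer FIR loop in Listing 3-3."""

    def __init__(self, coefficients):
        self.coeffs = list(coefficients)
        self.taps = len(self.coeffs)
        self.delay = [0.0] * self.taps   # delay line, cleared at start-up
        self.index = 0                   # models I0

    def step(self, sample):
        # The new sample overwrites the oldest entry, then the pointer advances.
        self.delay[self.index] = sample
        self.index = (self.index + 1) % self.taps
        acc = 0.0
        # `taps` multiply-accumulates, walking from the oldest to the newest sample.
        for k in range(self.taps):
            acc += self.coeffs[k] * self.delay[self.index]
            self.index = (self.index + 1) % self.taps
        # After `taps` reads the pointer is back at the oldest sample.
        return acc


fir = CircularFir([1.0, 2.0, 3.0])
print([fir.step(x) for x in [1.0, 0.0, 0.0, 0.0]])
```

Feeding a unit impulse through the model shows that the last coefficient in the buffer multiplies the newest sample, so under this access pattern the impulse response appears in reverse buffer order.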





.const sys_cont_reg=0x3FFF;
.const dm_wait_reg=0x3FFE; 
.const sport0_cont_reg=0x3FF6; 
.const sport0_sclkdiv=0x3FF5; 
.const sport0_rfsdiv=0x3FF4; 
.const taps=15; 

Listing 3-4: Include File, Constant Initialization 
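
The divider values written to sport0_sclkdiv and sport0_rfsdiv in Listing 3-2 can be checked against the rates quoted in its comments. The sketch below is ours; the two divider formulas are assumptions chosen because they reproduce the 2.048 MHz serial clock and 8 kHz frame rate stated in the listing, and should be confirmed against the ADSP-2101 documentation before being relied on.

```python
def serial_clock_hz(clkout_hz: float, sclkdiv: int) -> float:
    """Serial clock rate, assuming SCLK = CLKOUT / (2 * (SCLKDIV + 1))."""
    return clkout_hz / (2 * (sclkdiv + 1))


def frame_sync_hz(sclk_hz: float, rfsdiv: int) -> float:
    """Receive frame sync rate, assuming one frame every RFSDIV + 1 serial clocks."""
    return sclk_hz / (rfsdiv + 1)


clkout = 12_288_000                    # 12.288 MHz clock rate from Listing 3-2
sclk = serial_clock_hz(clkout, 2)      # SCLKDIV = 2
rate = frame_sync_hz(sclk, 255)        # RFSDIV = 255
print(sclk, rate)
```

With these assumptions the serial clock is 2.048 MHz and the frame (sampling) rate is 8000 Hz, matching the header comment of Listing 3-2.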



In summary, the Assembler: 

• supports high level constructs, 

• encourages modular code development, and 

• provides a full range of diagnostics.

3.2.3 The Linker



The ADSP-2101 Linker generates a complete executable program by linking together program
modules which were assembled separately. It can search libraries, which are simply
subdirectories, for subroutines to link. The output of the Linker is used by the Emulator, Simulator and
PROM Splitter. Figure 3-4 shows the files read and created by the Linker.

As shown in the previous section, the Assembler processes each source code module
separately, producing an Object file (.OBJ), a Code file (.CDE) and an Initialization file (.INT),
which contain information on the assembled code, source level declarations and initialization
information. Initialization data files (.DAT) are created separately. Changes in initialization
data only require relinking.

The Assembler output files (one set for each module to be linked), together with
initialization data files and the Architecture Description file, are used by the Linker. The Linker expects
to find an Architecture Description file with the default name 210x.ACH unless the user alters
this name with a switch; the files to be linked must be specified in the invocation command or
listed in a text file named with the -i switch.

The Linker creates one complete executable code file by resolving external references
and assigning addresses to relocatable code and data spaces.

The Linker can generate three files. The Memory Image file (.EXE) is always created,
and contains the actual program memory, data memory, and boot memory images after the
linkage. This file is used by the Simulator and Emulator, and is also passed to the PROM Splitter
to prepare a data file for a PROM burner. It has the default name 210x.EXE, which can also be
changed with a switch.

The optional map listing file (.MAP) assists the user in interpreting the result of the linkage.
This file is discussed in more detail later in this section. The optional debug symbol table file
(.SYM) lists all symbols encountered by the Linker, their absolute values and their scope of
reference. This file is used by the Simulator and Emulator.

The Linker can link together an unlimited number of modules and initialization data files.
The initialization data files (.DAT) are not explicitly named in the invocation line because they
are specified (with the .INIT directive) in the source code files. The data files are incorporated
by the Linker. When changes are made in the data files, simply relink the modules to incorporate
the new data file.

Running the Linker 

To invoke the Linker from the host system, the command form is:

LD21 file1 [file2 ...] [-switch ...]

or

LD21 -i file_all [-switch ...]

In the first form, the user explicitly names all the files to be linked (separated by spaces).
In the second form, the -i switch causes the Linker to read the file file_all for a list of files
to link. The file containing the list of files to link must be a simple text file with one
pathname/file per line.

In both forms, the filename(s) must identify the Assembler output files (.CDE, .OBJ and .INT)
without any extension. Modules to link are searched for in the current directory or in the pathname
specified in the command line.



Linker Switches 

The switch component of the invocation command can have any of the Linker switches
(separated by spaces). The Linker switches are listed below in Table 3-5; some require arguments
as shown. This list can be displayed on the monitor by invoking the Linker with no filename
or switches: LD21. A complete description of these switches is given in the ADSP-2101
Cross-Software Manual [2].

Switch            Result

-a archname       Use archname.ACH Architecture Description file instead of
                  default 210x.ACH
-c                Linker creates "top of RAM" symbol to locate the stack; this
                  symbol is used by programs generated with the ADSP-2101
                  C Compiler (see Chapter 7)
-dryrun           Linker does not generate an .EXE file; quick test to check for
                  link errors
-e target         Output files named target.EXE, instead of default 210x.EXE
-g                Linker generates a debugger symbol table, .SYM file
-i file_all       Links all files listed in text file file_all
-lib directory;   Directories listed are added to those found in ADIL
                  environment setting for locating libraries; multiple
                  directories are separated by commas in Unix systems or by
                  semicolons in PC-DOS systems
-old              Not used (ADSP-2100 feature)
-p                Library subroutines are assigned to the boot pages that call
                  them
-pmstack          Used with -c switch; moves "top of RAM" symbol to program
                  memory
-s stacksize      Used with -c switch; specifies a maximum size for stack
-x                Linker generates a .MAP file

Table 3-5: Linker Switches






MAP Listing File 

The Map Listing file is generated to help the user interpret the Linker result. The file provides 
information on: 

• Symbols 

A cross-reference listing of all symbols encountered, arranged by module. For each
module a list is shown of the symbols referenced in that module, with the following
information for each symbol: its absolute address, its length, the type of symbol (module,
variable, or label), and the type of memory (PM, DM, or BM).

• Memory segments

A map of the physical memory segments declared for the system with the absolute
address, length, and attributes of each. The information here reflects the content of the
Architecture Description file.

• Boot memory & Run-time program memory

An address map of modules and data structures on each boot page, and the corresponding
map of booted code in internal program memory ("bootable run-time program
memory"). Information on PROM byte addresses and boot PROM sizes required is also
provided.

• Fixed vs. Dynamic memory 

Maps of fixed program memory, dynamic data memory, and fixed data memory. These 
maps include address, length, and attribute specifications. 

• Error messages 
Linker error messages. 

• Libraries 

A list of libraries searched and used. 

A sample Map Listing file is shown in Listing 3-5. 



ADSP-210x Linker, version 2.02, copyright Analog Devices, Inc.
210x (210x.exe) mapped according to EZLAB_SYS (ezlabl.ach)

xref for module: MAIN_ROUTINE
  MAIN_ROUTINE   pm 0000 [0033]  module (global)
  DATA_BUFFER    dm 3800 [000F]  variable (global)
  COEFFICIENT    pm 0040 [000F]  variable (global)
  RESTARTER      pm 001C         label
  CLEAR_BUFFER   pm 0024         label
  WAIT           pm 0031         label
  FIR_START         0033 [0000]  extern (FIR_ROUTINE)

xref for module: FIR_ROUTINE
  FIR_ROUTINE    pm 0033 [000A]  module (global)
  FIR_START      pm 0033         label
  CONVOLUTION    pm 0038         label
  COEFFICIENT       0040 [000F]  extern (MAIN_ROUTINE)
  DATA_BUFFER       3800 [000F]  extern (MAIN_ROUTINE)

210x memory per EZLAB_SYS (ezlabl.ach)
  0000 - 07FF [ 2048.] pm ram data/code INT_PM
  3800 - 3BFF [ 1024.] dm ram data      INT_DM

boot memory and bootable run time program memory map:
  0000 - 0032 [   51.] pm ram module MAIN_ROUTINE of MAIN_ROUTINE
  0033 - 003C [   10.] pm ram module FIR_ROUTINE of FIR_ROUTINE
  0040 - 004E [   15.] pm ram circ variable COEFFICIENT of MAIN_ROUTINE

fixed program memory rom: 0.
fixed program memory ram: 76.

dynamic data memory map:

fixed data memory map:
  3800 - 380E [   15.] dm ram circ variable DATA_BUFFER of MAIN_ROUTINE

fixed data memory rom: 0.
fixed data memory ram: 15.

Listing 3-5: MAP Listing File



In summary, the Linker: 

• supports multi-module linking, and 

• maps the Assembler output to the system architecture. 
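
The address assignment visible in Listing 3-5 can be mimicked with a toy placement routine. The Python sketch below is ours and is not the Linker's actual algorithm: it simply keeps modules declared with an absolute address where they are and drops each relocatable module at the lowest address that does not collide with anything already placed. It ignores memory segments, boot pages and circular buffer alignment, and the function name place_modules is hypothetical.

```python
def place_modules(modules):
    """Assign start addresses.

    `modules` is a list of (name, length, absolute_address_or_None) tuples.
    Returns a dict mapping each module name to its start address.
    """
    placed = {}
    occupied = []                        # inclusive (start, end) ranges
    # Absolute modules are fixed first.
    for name, length, address in modules:
        if address is not None:
            placed[name] = address
            occupied.append((address, address + length - 1))
    # Relocatable modules take the lowest free, non-overlapping address.
    for name, length, address in modules:
        if address is None:
            start = 0
            moved = True
            while moved:
                moved = False
                for low, high in occupied:
                    if start <= high and start + length - 1 >= low:
                        start = high + 1
                        moved = True
            placed[name] = start
            occupied.append((start, start + length - 1))
    return placed


layout = place_modules([
    ("MAIN_ROUTINE", 0x33, 0x0000),      # declared with ABS=0
    ("FIR_ROUTINE", 0x0A, None),         # relocatable
])
for name, start in layout.items():
    print(f"{name:14s} pm {start:04X}")
```

Under these simplifying assumptions the relocatable module FIR_ROUTINE lands at 0033, immediately after the 51 words of MAIN_ROUTINE, which is the placement reported in the sample map.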



3.2.4 The Simulator 

The ADSP-2101 Simulator is an interactive window-oriented software tool for instruction level 
simulation and debugging of a user program. The Simulator configures itself according to the
user's target system architecture as defined in the Architecture Description file (.ACH). This
allows it to flag illegal operations such as reading from non-existent memory. Using the symbol
table created by the Linker, the Simulator is able to provide a fully symbolic environment for 
simulation and debugging. 

Briefly, the Simulator provides the following functions: 

• Instruction level simulation of booting and execution 

• Simulation of ports and SPORTs using host data files

• Simulation of internal and external interrupts 

• Complete assembly and disassembly of the ADSP-2101 instruction set 

• Multiple break conditions including break at address, break on condition, break on 
expression and break on address ranges 

• Full view of all processor registers and the ability to directly change any register's 
contents interactively 






The Simulator uses a variety of files as shown in Figure 3-5. The inputs are the architecture
description file (.ACH), the Program/Data Memory image (.EXE) and debug symbol table
(.SYM) generated by the Linker, simulated input data buffers (.DAT) downloaded from the host
computer, and simulated inputs. Simulator outputs are processed signals uploaded to the host
and simulated outputs to I/O ports. The primary Simulator diagnostics are displays of program
execution, register contents and chip status, stacks, PM/DM contents, etc. Upon first booting
the Simulator, the user sees the command window display as shown in Figure 3-6. From this
window, the user can open, configure and use all other features of the Simulator. Typing Ctrl-W
(control-w) displays a menu of window commands including, for example, OPEN, which in
turn displays a submenu of windows to be opened.



[Figure 3-5: Simulator I/O. Block diagram of the files used by the Simulator: Architecture Description File (.ACH); Simulator Configuration Files (optional); Command Input & Information Display; I/O Port & SPORT Data Files (.DAT); Temporary Cache Files (BOOT.CAC, ...).]

[Figure 3-6: Simulator Command Window. Screen display showing the window menu (Open, Move, Size, Close, Hide), the COMMAND window (displayed in decimal), and the key hints for window commands (^X# go to window, ^Z go to next window).]



The user can customize the contents and layout of many windows, the arrangement of
multiple windows on the screen and the command strings used to invoke various Simulator
functions. All customized settings can be stored in an external file and invoked automatically
upon startup. A complete illustrative example of screen configuration and window manipulation
is given in Chapter 4, section 4.3.1. For details, consult the Cross-Software Manual [2].

Invoking The Simulator 

The Simulator invocation command is: 

sim2101 [-a archname] [-w window] [-s script]

The -a switch names the .ACH Architecture Description file that was used to link the program.
The Simulator configures itself according to this target architecture. If the switch is omitted,
the default filename 210x.ACH is assumed. The optional -w switch identifies a .WIN file containing a stored
windows configuration which is loaded as the initial display when the Simulator is first booted.
The optional -s switch identifies a file containing Simulator commands to be executed
automatically upon startup.

Simulator Function Overview 

The Simulator generally provides multiple methods for achieving a given result. For example,
there are two different methods for setting breakpoints in program memory. Consequently, it
makes sense to think of the Simulator's functions rather than its command structure. The functional
capabilities of the Simulator are grouped into these broad classes:

• Interface management functions

These functions include the opening and closing of windows, changing the size and
position of windows, and changing the appearance of a window (removing or adding
items to the window and rearranging the items displayed within the window's space).
Additional functions described under this heading include navigating from window to
window. Saving specific window configurations is possible and is described in the next
chapter. Aliasing commands is another aspect of interface management.

• Set-up functions

These include loading the program to be simulated, opening I/O ports and associating
data files with I/O ports and SPORTs for the purposes of simulating input and output
data streams. Also included is the configuring of simulated interrupts.

• Register inspection & change functions

These functions allow you to view the contents of all the registers in the processor and,
in most cases, to change their contents directly if desired. Several windows are dedicated
to register displays.

• Memory inspection & change functions

These functions include simple display of the various memory spaces (as either data or
code), saving the contents of memory to files for later analysis and plotting the contents
of data memory.

• Simulator control & debugging functions

Control functions include starting and stopping the execution of your program and
resetting the simulated processor. Debugging functions include setting breakpoints,
break conditions and watchpoints. The Simulator supports a wide variety of break
expressions for debugging purposes.

A more detailed discussion including examples is given in Chapter 4. In summary, the 
Simulator: 

• provides an interactive and user-friendly interface, 

• supports full symbolic assembly and disassembly, 

• simulates hardware configuration, 

• simulates interrupt and I/O handling, 

• flags illegal operations, and 

• displays the internal operations and status of the processor. 



3.2.5 The PROM Splitter 

The ADSP-2101 PROM Splitter extracts the address information and the contents of the ROM
portion of the memory image file (.EXE) and formats the extracted images for uploading to
PROM burners.

The PROM Splitter creates output files for program, data, and boot memory. Three usable
files are created for PM to organize the PROMs in word addresses corresponding to three-byte
instructions. Two usable files are created for DM to organize any data PROMs in terms of
two-byte data words. One usable file is created for BM, which is physically byte-wide although
organized internally in vertical groups of four bytes per word address. Both program and data
memory can also be optionally output as a single stream of bytes for vertical rather than horizontal
grouping of words in the PROMs.

Since the PROM Splitter is more appropriate for commercial use, we will not discuss it 
any further. For more information, consult the Cross-Software Manual [2]. 
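
Although we do not use the PROM Splitter in this book, the idea of splitting words across byte-wide PROMs is simple to illustrate. The Python sketch below is ours and does not reproduce the PROM Splitter's file formats: it only shows how a list of 24-bit program words yields three byte streams and a list of 16-bit data words yields two. The function name split_words and the most-significant-byte-first ordering are assumptions made for the illustration.

```python
def split_words(words, bytes_per_word):
    """Split each word into bytes, one output stream per byte position.

    Stream 0 receives the most significant byte of every word
    (an assumed ordering, chosen only for this illustration).
    """
    streams = [bytearray() for _ in range(bytes_per_word)]
    for word in words:
        for i in range(bytes_per_word):
            shift = 8 * (bytes_per_word - 1 - i)
            streams[i].append((word >> shift) & 0xFF)
    return [bytes(stream) for stream in streams]


pm_streams = split_words([0x123456, 0xABCDEF], 3)   # 24-bit instructions
dm_streams = split_words([0x3FFF, 0x0001], 2)       # 16-bit data words
print([s.hex() for s in pm_streams])
print([s.hex() for s in dm_streams])
```

Each stream would correspond to one byte-wide PROM, so reading the same address from all three program memory PROMs reassembles one instruction word.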

3.3 HARDWARE DEVELOPMENT TOOLS 

The hardware development tools for the ADSP-2101 consist of emulators with various
degrees of capability, from high end to low end. A full function emulator with access to various
parts of the chip is manufactured by Microtek. At the intermediate level is an in-circuit emulator
called EZ-ICE™ with moderate control over the processor. At the low end is a low cost
evaluation system called EZ-LAB™ which can be used as a demonstration as well as a target system.
In this section we briefly describe the EZ-Tools consisting of the EZ-ICE and EZ-LAB systems
since they are appropriate in an educational environment.

3.3.1 The In-Circuit Emulator - EZ-ICE™

The ADSP-2101 EZ-ICE is a compact, easy-to-use in-circuit emulator for debugging code and
testing ADSP-2101 based systems. It is a 3.3" x 3.3" in-circuit probe board containing a 121-pin
emulator version of the ADSP-2101. A 68-pin PGA footprint protrudes from the bottom of the
board. These pins are inserted into a socket in a target system. EZ-ICE requires a +5 V dc power
supply capable of supplying 1 A of current.

EZ-ICE can be run at full speed (12.5 MIPS). There is no degradation of ADSP-2101
performance or signal timing other than BR, BG and RESET. The user can select via a jumper
either the target system clock or the EZ-ICE clock. The oscillator is socketed to allow the use
of other oscillator devices to achieve different clock speeds.

For display and input, EZ-ICE requires either a VT100 as the terminal device or a personal
computer (PC) running a terminal emulation program. The program should be capable of
emulating a VT100-type terminal and allow the transfer of ASCII files between the PC and
EZ-ICE. EZ-ICE is connected to the PC via an RS-232 cable. It automatically adjusts its baud
rate to match the host PC's baud rate. A baud rate of 9600 or 19200 is recommended.

The user has the option of running ADSP-2101 programs from target system memory, 
emulator overlay memory or a combination of both. The 8 K by 24-bit overlay program/data 
memory option is jumper selectable. 

The monitor firmware in EZ-ICE controls all emulator functions. The user interface is 
simple. There are no commands to remember; everything is menu- or cursor-controlled. In 
addition, EZ-ICE firmware intercepts illegal user inputs, making debugging work easier. 






Control and debug features include single-step capabilities with or without register
displays and a multiple breakpoint capability (with up to 16 breakpoints individually set).

At power-up, the host processor is automatically reset and a diagnostic check is performed 
to ensure that both host memory and EZ-ICE are functional. A report of any failures found is 
automatically displayed. 

EZ-ICE Features 

The following is a summary of the EZ-ICE features: 

• Stand alone operation via RS-232 to a PC 

• In-Circuit self-emulation plugs directly into a target board 

• Full speed emulation 

• Single step capability 

• Multiple break points 

• 8K overlay memory 



EZ-ICE vs. Full Featured Emulator 

EZ-ICE is a basic, easy-to-use emulator that does not have some of the advanced features of 
the ADSP-2101 full featured emulator. It has the following limitations: 

• No trace capability 

• No hardware event triggers 

• Breakpoints for program memory instructions only 

• No data memory breakpoints (watchpoints)

• Cannot monitor the state of all ADSP-2101 pins

• Cannot plot memory contents

• No symbolic debug or online assembly is provided by emulator firmware, only
disassembly on program memory reads

• No session record is kept by emulator firmware

• Cannot use breakpoints with internal ADSP-2101 memory while in GO mode.

EZ-ICE Functions 

The EZ-ICE emulator uses Simulator-compatible memory image files. However, EZ-ICE can
run stand-alone, unlike the Simulator, which runs only on the host. The emulator has the capability
to set software breakpoints, upload and download PM and DM to and from the host, inspect
and change all internal registers, halt execution and single-step, and switch memory banks
between emulator and target.

The hardware component layout is shown in Figure 3-7.

The basic command menu of the emulator firmware is shown in Figure 3-8. It is discussed
in more detail in Chapter 4.

Figure 3-7: EZ-ICE Component Layout

3.3.2 The Evaluation System - EZ-LAB™ 

The ADSP-2101 EZ-LAB demonstration and evaluation board is a complete system on a 4 1/2"
by 6" board. It allows the user to test coded applications in real time. No host or PC is needed
to operate EZ-LAB. At reset, the ADSP-2101 on EZ-LAB boots code and program memory
data into its internal program memory from a 64 K x 8-bit EPROM. It then executes the code.
EZ-LAB is capable of stand-alone operation. It requires a +5 V dc power supply capable of
supplying 500 mA and a ±12 V dc power supply capable of supplying 200 mA with a common
power return.

"E Z - I C E" ADSP-2101 Emulator
Basic Command Menu 

Read/Write Registers 

Read Data Memory 
Write Data Memory 
Read Program Memory 
Write Program Memory 
Reset Processor 
Single Step 

Set, Clear, List Breakpoints 
Run Until Breakpoint 
Boot From External EPROM 
Download From Host 
Upload To Host 
Display Stacks 



Figure 3-8: EZ-ICE Basic Command Menu 



A codec is attached to one of the serial ports. The other serial port can be configured for 
interrupts and flags by changing on-board jumpers. The input signal to the codec can be a 
microphone, signal generator or any other high impedance source, and the resulting output signal 
can drive a small speaker. 

EZ-LAB has four DAC outputs to connect to an oscilloscope for display. In addition, 
there is an expansion connector and a serial port connector. The connectors allow access to the 
ADSP-2101's serial ports, external address bus, external data bus, control signals, and interrupt 
lines. 

EZ-LAB provides manual control of several functions: pushbuttons assert the 
ADSP-2101's IRQ2 interrupt and FLAG IN pins, and an on-board hardware RESET switch resets 
EZ-LAB. 

As a hardware development system for this book, we will combine EZ-LAB and EZ-ICE 
to form a high speed DSP workstation with an interactive, window-based debugging interface. 
This can be done by simply removing the ADSP-2101 device from the EZ-LAB board and 
plugging in EZ-ICE. This combination will allow us to prototype and evaluate our experiments 
and applications with virtually no initial time investment in hardware design. 






EZ-LAB Features 

The following is a summary of EZ-LAB's features: 

• ADSP-2101 12.5 MHz microcomputer, 

• programs can be loaded into the internal program memory of the ADSP-2101 from a 
standard, low-cost EPROM; no external memory board is needed, 

• manual control of several EZ-LAB functions, 

• codec connected to SPORT0, 

• the input signal to the codec can be a microphone, or signal generator, and the resulting 
output signal can drive a small speaker, 

• a small size circuit board. 

The EZ-LAB board layout is shown in Figure 3-9. 

Figure 3-9: EZ-LAB Board Layout 

3.4 HOST PC REQUIREMENTS 

The software development tools described above are run on a host computer. Although this 
host can be any computer platform, it is convenient to use a personal computer in the educational 
environment of a digital signal processing laboratory. 

The ADSP-2101 Cross-Software executes on an IBM-AT or compatible PC with a hard 
disk. Installation of the Cross-Software requires PC-DOS 3.0 or later, 640 KBytes of memory, 
and a color display system (CGA, EGA, or VGA type). 

Although the emulator program executes from firmware on the board, it sends menus, the 
results of operations, and files to the host. Similarly, it accepts instructions and files from the 
host. This is done through the serial port of the PC via an RS-232 cable. Therefore the host PC 
must run a communications program, such as Procomm, to imitate a VT100-type terminal. The 
communications program must be capable of setting the baud rate for the serial port at 9600 or 
19200. The communications parameters are: no parity, 8 data bits, and 1 stop bit. The 
communications program should also be capable of transferring files between emulator and host 
using an ASCII format. 
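
As a rough guide to what these settings mean for file transfers, the following sketch estimates 
the time needed to send a file over the serial link. It assumes standard asynchronous framing, 
that is, one start bit in addition to the 8 data bits and 1 stop bit listed above (10 bits per 
character), and it ignores any terminal or protocol overhead. The function name and the 
6144-character example size are illustrative only. 

```python
def transfer_seconds(num_chars, baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Estimate the time to send num_chars characters over an asynchronous link."""
    # One start bit per character is assumed (standard asynchronous framing).
    bits_per_char = 1 + data_bits + parity_bits + stop_bits
    return num_chars * bits_per_char / baud


for baud in (9600, 19200):
    seconds = transfer_seconds(6144, baud)
    print("%5d baud: %.1f s for 6144 characters" % (baud, seconds))
```

Doubling the baud rate halves the estimate, which is why 19200 baud is worth selecting when 
the communications program and cable support it. 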

Finally to create source program files, the PC must have a screen editor. 



3.5 SUMMARY 

In this chapter we described an important element in the environment of the ADSP-2101 based 
systems: the development system. Several software tools from system builder to PROM splitter 
were explained and two hardware tools were discussed. Requirements on a host PC were also 
described. 



chapter 4 



GETTING STARTED WITH 
THE ADSP-2101 



4.1 INTRODUCTION 

In the first three chapters, we described the hardware, software, and development tools required 
to understand, program, and integrate the ADSP-2101 microcomputer in a laboratory environment. 
Even though it is possible to comprehend this description after reading these chapters, 
it may be difficult to use this large body of information effectively in any application. Each aspect 
of the system development must be carefully studied and practiced to gain insight and experience. 
This chapter is designed to provide such an experience. It provides hands-on training on 
important aspects of system development so that the students can be brought up to minimum 
speed in order to undertake more elaborate experiments and projects. 

In Section 4.2, we begin with the System Builder tool, which describes the target 
ADSP-2101 system. This is an important aspect of an overall design. An accurate and complete 
description of the target system avoids unwarranted access to memory or I/O locations and 
optimizes program coding. After providing a complete description of System Builder directives, 
we discuss three sample target systems and explain how to write their architecture description 
files. We complete this section with more target systems and exercises. 

Section 4.3 deals with the Simulator, which is a very important software tool. It allows 
us to perform instruction-level simulation. Its user interface is both interactive and symbolic. 
We give a sample session to acquaint the student with its various windows, customized screens, 
and commands. A complete window command description is then given. The Simulator is also 
an excellent vehicle to learn the instruction set of the ADSP-2101. Therefore we provide an 
instruction set work-out involving various types of instructions. 

The hardware emulator, the EZ-ICE, is studied in Section 4.4. It is the final link in the 
overall development cycle. In that section, we first discuss the proper power-up 
and power-down operational sequences for the emulator. A complete description of its firmware 
and window commands then follows. 

4.2 SYSTEM BUILDER 

The System Builder is a software tool which is used to describe the target system. The 
ADSP-2101 is located in this target system which supports it and might contain A/D and D/A 
converters, a ROM to hold program memory and several RAMs for proper execution and 
operation. It must be pointed out that this target system is not unique and that every product or 
application that is built around the ADSP-2101 microcomputer has its own target environment. 
Therefore the ADSP-2101 must be aware of its environment so that it will not access memory 
locations that are non-existent or interface with unconnected I/O devices. This is done through 
a System Specification File (.SYS) which is created using a text editor on a computer. The 
system specification file specifies the amount of RAM and ROM available, the allocation of 
program and data memory, memory-mapped I/O ports, and a host of other important information. 
The System Builder processes the .SYS file and generates an Architecture Description File 
(.ACH). This file is used by the Linker to resolve relocatable program and/or data memory. It 
is also used by the Simulator and the Emulator to set up a target environment. 

In this section, we will examine some sample target systems and write their specification 
files. The specification file is essentially a program written using System Builder directives. 
Therefore we will begin with a detailed description of these directives. We will then execute 
the System Builder on the sample system's specification files to generate architecture description 
.ACH files. Finally we will provide some exercises on the System Builder. 

4.2.1 System Builder Directives 

This section describes each System Builder directive and its syntax. 

.SYSTEM Directive 

The .SYSTEM directive must be the first statement in the System Specification source 
file. The identifier name given as its argument is the name of the system displayed in the 
Simulator. The .SYSTEM directive has the form: 
.SYSTEM system_name; 

.ENDSYS Directive 

The .ENDSYS directive must be the last statement in the file. The System Builder pro- 
cessing terminates at the .ENDSYS directive statement. The .ENDSYS directive has the 
form: 

.ENDSYS; 

.ADSP2101 Directive 

This directive identifies the processor. Its use is mandatory to clearly differentiate between 
ADSP-2100-based and ADSP-2101-based systems. If the directive is not present, the 
Cross-Software system assumes that the processor is an ADSP-2100. 

.CONST Directive 

The .CONST directive defines System Builder constants. Once you declare a constant, 
you may use it in place of its numeric value. This symbolic constant is recognized only 
by the System Builder; the definition is not carried over to the Assembler or Simulator. 
The .CONST directive has the form: 
.CONST constant_name = constant or expression, ... ; 

A single .CONST directive may declare one or several constants, separated by commas. 
If you wish to define the value 15 for the term taps, for example, the directive would be 
as follows: 

.CONST taps = 15; 

.PORT Directive 

The .PORT directive declares a memory-mapped parallel I/O port. Ports can be placed in 
either data or program memory, and must be declared in one or the other. The directive 
takes the absolute physical address of the I/O port as a modifier, and the symbolic name 
of the port as an argument. The .PORT directive has the form: 
.PORT/qualifier ... portname; 
There are two required qualifiers: 

PM or DM (in which memory space) 

ABS=address (absolute address (constant)) 

The port address is specified by a constant; portname is an identifier. For example, 

.PORT/DM/ABS=0x0400 ad_sample; 

declares a port identified as ad_sample located at absolute data memory address 1024 
(decimal). Assembler references to this same symbolic name are correctly interpreted by 
the Linker, using the .ACH file information. 

.MMAP Directive 

The .MMAP directive specifies the state of the MMAP pin on the ADSP-2101. It has the 
form .MMAP0 (MMAP pin held LO) or .MMAP1 (MMAP pin held HI). If .MMAP0 is 
used, boot loading takes place and on-chip program memory begins at address zero. If 
.MMAP1 is used, no boot loading takes place and on-chip program memory is mapped 
at the top of the program memory space. When this directive is omitted, the default is 
.MMAP0. 

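
The effect of the MMAP pin on the location of on-chip program memory can be sketched as 
follows. The sketch assumes a 16K-word program memory address space and 2K words of on-chip 
program memory; both figures are assumptions chosen to agree with the 3800-3fff range reported 
for internal program memory in Example 4-3. The function name is illustrative only. 

```python
PM_SPACE_WORDS = 0x4000   # assumed size of the program memory address space
ONCHIP_PM_WORDS = 2048    # on-chip program memory, 2K words


def onchip_pm_range(mmap_pin):
    """Return (first, last) address of on-chip program memory for MMAP = 0 or 1."""
    if mmap_pin == 0:
        start = 0                                   # boot loading, PM starts at zero
    elif mmap_pin == 1:
        start = PM_SPACE_WORDS - ONCHIP_PM_WORDS    # mapped at the top of PM space
    else:
        raise ValueError("MMAP pin must be 0 or 1")
    return start, start + ONCHIP_PM_WORDS - 1


for pin in (0, 1):
    first, last = onchip_pm_range(pin)
    print("MMAP=%d: on-chip PM at %04x-%04x" % (pin, first, last))
```
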
.SEG Directive 

The .SEG directive names a specific section of physical memory in the target system, and 
describes its attributes. In effect, the default memory map from the perspective of the 
System Builder is no memory at all. Until you declare and define a memory segment it 
does not exist. The .SEG directive has the form: 
.SEG/qualifier ... seg_name[length]; 
The following qualifiers are mandatory: 

PM or DM or BOOT=0, 1, 2, 3, 4, 5, 6, 7 (in which memory space) 

RAM or ROM (memory type) 

While the following are optional: 

ABS=address (absolute start address) 

DATA or CODE or DATA/CODE (what is stored) 

seg_name is an identifier; length, which must be enclosed in brackets, is the number of 
words in the segment. 

The .SEG directive declares three types of memory segments: program memory (PM), 
data memory (DM) and boot memory (BOOT). Qualifiers may specify the absolute start 
address of the segment, the physical memory type (RAM or ROM) and what is stored 
(DATA and/or CODE). 

PM memory segments can be either CODE only, DATA only, or both CODE and DATA 
(defaults to CODE). For a PM segment that contains code and data, both modifiers must 
be used in the directive statement. The processor requires that any data access to PM must 
be made to sections with the DATA attribute. If a system requires that executable code 
be read or written by the processor as data, these sections should be declared with both 
CODE and DATA attributes. 

DM memory segments must be DATA only. Therefore, the /DATA modifier can be 
omitted. An error is generated if a DM segment is assigned the CODE attribute. 

BOOT memory segments may be either ROM- or RAM-type; in most systems, however, 
the boot memory chips are PROM and all BOOT segments are specified as ROM-type. 
Boot memory always defaults to both CODE and DATA; the CODE and DATA attributes 
are unnecessary. The BOOT modifier always specifies the page number, for example, 
BOOT=0. A system may have up to 8 boot pages, with page numbers from 0 to 7. Each 
page can hold up to 2K words of code and data. The System Builder knows how long a 
page can be and the possible boundaries for each page; it ignores the ABS modifier for 
boot pages. An individual declaration must be made for each boot page required. 

Memory segments are assigned symbolic names. In the Assembler you may locate 
individual code modules and data objects (buffers and variables) in segments by name. The 
Assembler accepts the segment references; the Linker resolves them using the .ACH file. 

The length of the segment is specified by the bracketed expression, as in some_data[1024]. 
The unit is always words, either 16-bit data or 24-bit instructions. This means that data 
memory segment size in bytes is 2x the word count, program memory size in bytes is 3x 
the word count and boot memory size is 4x the word count. The latter reflects the padding 
of boot memory with an extraneous byte per instruction in order to place the beginning 
of every instruction on an even byte boundary. 
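
The word-to-byte conversion described above can be written out as a short sketch. The 
multipliers come directly from the preceding paragraph; the function and table names are 
illustrative only. 

```python
# Bytes occupied per word in each memory space, as stated in the text:
# 16-bit data words, 24-bit instructions, and boot words padded to 4 bytes.
BYTES_PER_WORD = {"DM": 2, "PM": 3, "BOOT": 4}


def segment_bytes(space, words):
    """Return the size in bytes of a segment holding `words` words in `space`."""
    return BYTES_PER_WORD[space] * words


print(segment_bytes("DM", 1024))        # 1K-word data segment      -> 2048
print(segment_bytes("PM", 2048))        # 2K-word program segment   -> 6144
print(segment_bytes("BOOT", 2048))      # one 2K-word boot page     -> 8192
print(8 * segment_bytes("BOOT", 2048))  # eight boot pages          -> 65536
```

The last two results agree with figures quoted elsewhere in the book: one boot page fills an 
8K x 8-bit EPROM, and eight boot pages fill the 64K x 8-bit EPROM used on EZ-LAB. 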

The example 

.SEG/BOOT=0/ROM boot_mem[2048]; 

declares the boot segment, boot_mem, which is physical memory type ROM, residing in 
boot page zero (corresponding automatically to absolute address 0). The length of the 
segment is 2048 words corresponding to one page of boot memory. 

The example 

.CONST onchip_pm = 2048; 

.SEG/PM/RAM/ABS=0/CODE/DATA int_pm[onchip_pm]; 

declares a program memory segment called int_pm, which is memory type RAM at 
absolute location 0. This segment may hold both code and data. The length of the segment 
is 2048 words. This corresponds to the ADSP-2101 on-chip program memory space. 
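
When several similar target systems must be described, the directive text can be generated 
rather than typed. The following is a minimal sketch, not part of the Cross-Software: the 
helper names are hypothetical, and the output simply follows the .SEG and .PORT forms given 
above, with addresses written as four-digit hexadecimal constants. 

```python
def seg_directive(space, mem_type, name, length, abs_addr=None, contents=()):
    """Build one .SEG directive line from its qualifiers, name, and length."""
    qualifiers = [space, mem_type]
    if abs_addr is not None:
        qualifiers.append("ABS=0x%04x" % abs_addr)   # optional start address
    qualifiers.extend(contents)                      # optional CODE and/or DATA
    return ".SEG/%s %s[%d];" % ("/".join(qualifiers), name, length)


def port_directive(space, abs_addr, name):
    """Build one .PORT directive line; both qualifiers are required."""
    return ".PORT/%s/ABS=0x%04x %s;" % (space, abs_addr, name)


print(seg_directive("BOOT=0", "ROM", "boot_mem", 2048))
print(seg_directive("PM", "RAM", "int_pm", 2048, abs_addr=0,
                    contents=("CODE", "DATA")))
print(port_directive("DM", 0x0400, "ad_sample"))
```

The generated lines would still have to be placed between .SYSTEM and .ENDSYS statements 
and processed by the System Builder, which remains the authority on whether a description 
is valid. 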

4.2.2 System Architecture Examples 

Target systems are generally provided using a logic block diagram. Sometimes a text description 
is available. We will use both approaches to study target systems. 

Example 4-1: 

Consider a target system shown in Figure 4-1. 

Figure 4-1: Example 4-1 System Architecture (Note: Data, Address, and Control signals are not 
shown.) 



Observations: 

• The ADSP-2101 microcomputer is in the system. 

• The MMAP pin is held LO. 

• 8K EPROM (8-bit) containing boot memory is connected. Since the program 
memory is 24-bit (3-byte) wide and since every 4th byte is not used (refer to 
Chapter 1), the boot memory length is 2K. 

• 8K external data memory (16-bit) is selected when address line A13 is zero. Hence 
the memory chip will be selected at any value below 2^13 = 0x2000 = 8192 
beginning at absolute address 0 (refer to Data Memory Map in Chapter 1). 

• 8K of external program memory (24-bit) is selected when address line A11 is 
one. Hence the memory chips will be selected beginning at absolute address 
0x0800 or 2048 (refer to program memory map for MMAP=0 in Chapter 1). 

• This example system does not have any I/O ports declared. 

The System Specification File for describing the above system is shown below. Let us 
call it EXMPL_1.SYS. Note that program lines must end with a semicolon (;) and that 
comment fields are enclosed within braces, { }. 

.SYSTEM EXAMPLE_1;                                {System Name} 
.ADSP2101;                                        {ADSP-2101 system} 
.MMAP0;                                           {Enable Boot loading} 
.SEG/BOOT=0/ROM boot_mem[2048];                   {Boot Segment} 
.SEG/RAM/ABS=0x3800/DM/DATA int_dm[1024];         {Internal Data Memory} 
.SEG/RAM/ABS=0/DM/DATA ext_dm[8192];              {External Data Memory} 
.SEG/RAM/ABS=0/PM/CODE/DATA int_pm[2048];         {Internal Program Memory} 
.SEG/RAM/ABS=0x0800/PM/CODE/DATA ext_pm[8192];    {External Program Memory} 
.ENDSYS;                                          {End of File} 

Explanations: 

• The first directive must be a .SYSTEM which also gives a name to the system. 

• The second directive establishes an ADSP-2101 system. 

• The third directive specifies that boot loading will take place upon reset and that 
on-chip program memory will begin at address zero. 

• The fourth directive identifies a 2K-word space for one page of external boot 
memory. 

• The next two directives declare data memory space. The first of these declares 
the 1 K-word on-chip data memory which always begins at address 0x3800 or 
14336 (decimal). The second one declares the 8K-word external memory 
beginning at address zero. 

• The next two directives declare program memory space. The first of these declares 
the 2K-word on-chip program memory which begins at address zero since 
MMAP=0. The second one declares the 8K-word external memory beginning at 
address 0x0800 or 2048 (decimal). 

• Finally, the last directive signals the end of the target system description file. 
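
Before running the System Builder, it can be helpful to confirm by hand that the declared 
segments do not collide. The sketch below does this for the four RAM segments of 
EXMPL_1.SYS. It is an independent check written for illustration; whether the System Builder 
itself reports overlapping segments is not covered here, and the function name is 
hypothetical. 

```python
from itertools import combinations

# (memory space, segment name, start address, length in words) from EXMPL_1.SYS
SEGMENTS = [
    ("PM", "int_pm", 0x0000, 2048),
    ("PM", "ext_pm", 0x0800, 8192),
    ("DM", "ext_dm", 0x0000, 8192),
    ("DM", "int_dm", 0x3800, 1024),
]


def find_overlaps(segments):
    """Return pairs of segment names that share addresses in the same space."""
    overlaps = []
    for a, b in combinations(segments, 2):
        a_end = a[2] + a[3]          # one past the last address of segment a
        b_end = b[2] + b[3]          # one past the last address of segment b
        if a[0] == b[0] and a[2] < b_end and b[2] < a_end:
            overlaps.append((a[1], b[1]))
    return overlaps


for space, name, start, length in SEGMENTS:
    print("%s %-7s %04x-%04x" % (space, name, start, start + length - 1))
print("overlaps:", find_overlaps(SEGMENTS))
```

The address ranges printed by the sketch can be compared with the summary that the System 
Builder displays for this example. 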

The architecture description file, EXMPL_1.ACH, can now be generated by executing 
System Builder on EXMPL_1.SYS. 

BLD21 EXMPL_1 

The System Builder responds with error messages, if any, or with a summary of the 
architecture created, displayed on the screen, which is shown below. Note that the .ACH file 
is a binary file and as such should not be printed or listed. 

boot memory 
0000-07ff [0800] Rom code/data BOOT_MEM 
program memory 
0000-07ff [0800] ram data INT_PM 
0800-27ff [2000] ram code/data 24 required bits in memory width EXT_PM 
data memory 
0000-1fff [2000] ram data EXT_DM 
3800-3bff [0400] ram data INT_DM 

The System Builder alerts the user with the message "24 required bits in memory width" 
for EXT_PM, a requirement which is satisfied by the above system. 

In the next example, we will include some I/O ports in the target system. 



Example 4-2: 



Figure 4-2 shows another target system in a block diagram form. 

Figure 4-2: Example 4-2 System Architecture (Note: Data, Address, and Control signals are not 
shown.) 

Observations: 

• The ADSP-2101 microcomputer is in the system. 

• The MMAP pin is held LO. 

• 8K EPROM (8-bit) containing boot memory is connected. Since the program 
memory is 24-bit (3-byte) wide and since every 4th byte is not used, the boot 
memory length is 2K. 

• 8K external data memory (16-bit) is selected when address line A13 is zero. Hence 
the memory chip will be selected at any value below 2^13 = 0x2000 = 8192 
beginning at absolute address 0. 

• A/D converter is selected when both A2 and A13 are high. Hence the absolute 
address for A/D converter is 2^13 + 2^2 = 0x2004 = 8196. 

• D/A converter is selected when both A1 and A13 are high. Hence the absolute 
address for D/A converter is 2^13 + 2^1 = 0x2002 = 8194. 

• A parallel I/O port is found at 0x2001. This is selected when both A0 and A13 
are high. 

• This example system does not have any external program memory. 
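
The port addresses in the observations above follow directly from the address lines that 
are held high. The sketch below reproduces the arithmetic; it assumes that all address lines 
not mentioned are low, and the function name is illustrative only. 

```python
def address_from_lines(*high_lines):
    """Return the address obtained when only the listed address lines are high."""
    address = 0
    for line in high_lines:
        address |= 1 << line      # address line An contributes 2**n
    return address


# Address lines held high for each port in Example 4-2
PORTS = {
    "ad_converter": (13, 2),
    "da_converter": (13, 1),
    "io_port": (13, 0),
}

for name, lines in PORTS.items():
    addr = address_from_lines(*lines)
    print("%-12s 0x%04x = %d" % (name, addr, addr))
```
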
The System Specification File, EXMPL_2.SYS is shown below. 



.SYSTEM EXAMPLE_2;                                {System Name} 
.ADSP2101;                                        {ADSP-2101 System} 
.MMAP0;                                           {Enable Boot Loading} 
.SEG/BOOT=0/ROM boot_mem[2048];                   {Boot Segment} 
.SEG/RAM/ABS=0x3800/DM/DATA int_dm[1024];         {Internal Data Memory} 
.SEG/RAM/ABS=0/DM/DATA ext_dm[8192];              {External Data Memory} 
.SEG/RAM/ABS=0/PM/DATA/CODE int_pm[2048];         {Internal Program Memory} 
.PORT/ABS=0x2004/DM ad_converter;                 {A/D Converter} 
.PORT/ABS=0x2002/DM da_converter;                 {D/A Converter} 
.PORT/ABS=0x2001/DM io_port;                      {Parallel I/O Port} 
.ENDSYS;                                          {End of File} 



The architecture description file, EXMPL_2.ACH, can now be generated by executing 
System Builder on EXMPL_2.SYS. 



BLD21 EXMPL_2 



A summary of the architecture created by the System Builder is shown below. 



boot memory 
0000-07ff [0800] Rom code/data BOOT_MEM 
program memory 
0000-07ff [0800] ram code/data 24 required bits in memory width INT_PM 
data memory 
0000-1fff [2000] ram data EXT_DM 
2001-2001 [0001] ram data IO_PORT 
2002-2002 [0001] ram data DA_CONVERTER 
2004-2004 [0001] ram data AD_CONVERTER 
3800-3bff [0400] ram data INT_DM 
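
Each summary line gives the first and last address of an entry, its length in brackets, its 
attributes, and its name, all addresses being hexadecimal. The sketch below reads such lines 
and confirms that the bracketed length equals last minus first plus one. It assumes the line 
layout printed in this book; the exact spacing of the actual screen output may differ, and 
the function name is hypothetical. 

```python
def parse_summary_line(line):
    """Split a summary line into (start, end, length, name), all as integers."""
    fields = line.split()
    start_text, end_text = fields[0].split("-")
    start = int(start_text, 16)
    end = int(end_text, 16)
    length = int(fields[1].strip("[]"), 16)   # bracketed length is hexadecimal
    name = fields[-1]                         # the name is the last field
    return start, end, length, name


SUMMARY = """\
0000-1fff [2000] ram data EXT_DM
2001-2001 [0001] ram data IO_PORT
2004-2004 [0001] ram data AD_CONVERTER
3800-3bff [0400] ram data INT_DM
"""

for line in SUMMARY.splitlines():
    start, end, length, name = parse_summary_line(line)
    assert length == end - start + 1          # length must match the range
    print("%-12s %5d words starting at address %d" % (name, length, start))
```
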



Example 4-3: 

In this example, a target system containing the ADSP-2101 is described as follows: 

• Boot sequence disabled with MMAP pin at HI level, 

• 4K external data memory (ext_dmem) in data memory location 0x1000, 

• 2K external program memory (ext_pmem) for both code and data beginning at 
0x0000, 

• Internal program memory is to be used for instruction code only, 

• 2 A/D converters (adc_1 and adc_2) at data memory locations 0x3000 and 0x3001 
respectively, 

• One D/A converter (dac) at data memory location 0x3400. 

The corresponding System Specification File, EXMPL_3.SYS is: 

.SYSTEM EXAMPLE_3;                                {System Name} 
.ADSP2101;                                        {ADSP-2101 System} 
.MMAP1;                                           {Disable Boot Loading} 
.SEG/RAM/ABS=0x3800/DM/DATA int_dmem[1024];       {Internal Data Memory} 
.SEG/RAM/ABS=0x1000/DM/DATA ext_dmem[4096];       {External Data Memory} 
.SEG/RAM/ABS=0x3800/PM/CODE int_pmem[2048];       {Internal Program Memory} 
.SEG/RAM/ABS=0/PM/CODE/DATA ext_pmem[2048];       {External Program Memory} 
.PORT/DM/ABS=0x3000 adc_1;                        {A/D Converter 1} 
.PORT/DM/ABS=0x3001 adc_2;                        {A/D Converter 2} 
.PORT/DM/ABS=0x3400 dac;                          {D/A Converter} 
.ENDSYS;                                          {End of File} 

A summary of the architecture created by the System Builder is shown below. 

boot memory 
program memory 
0000-07ff [0800] ram code/data 24 required bits in memory width EXT_PMEM 
3800-3fff [0800] ram code INT_PMEM 
data memory 
1000-1fff [1000] ram data EXT_DMEM 
3000-3000 [0001] ram data ADC_1 
3001-3001 [0001] ram data ADC_2 
3400-3400 [0001] ram data DAC 
3800-3bff [0400] ram data INT_DMEM 



4.2.3 System Architecture Exercises 

The following exercises are designed to emphasize the description and use of the target system 
architecture. In each individual application, memory segments and I/O ports must be carefully 
described so that easier and better assembly language routines can be written. 

1. Draw the program memory map and data memory map, and write a System Specification 
file (.SYS) to describe the following ADSP-2101 hardware architecture. Be sure to 
include the on-chip memory. 

• Boot sequence enabled, 

• 2K words of boot memory (boot_mem), 

• 4 A/D converters (ad_1, ad_2, ad_3 and ad_4) at data memory locations 0x2000, 
0x2001, 0x2002 and 0x2003 respectively, 

• 4 D/A converters (da_1, da_2, da_3 and da_4) at data memory locations 0x1000, 
0x1001, 0x1002 and 0x1003 respectively, 

• 2K external data memory (dm_ext) in data memory location 0x3000, 

• 1 I/O port (in_out) found in data memory location 0x3400. 

2. A target system containing the ADSP-2101 is shown in Figure 4-3. Draw the program 
memory map and data memory map, and write a System Specification file (.SYS) for this 
system. (Note: Data, Address, and Control signals are not shown.) 

Figure 4-3: Exercise 2 System Architecture 

4.3 SIMULATOR 

The Simulator provides instruction-level simulation with a reconfigurable user interface. It is 
briefly described in Chapter 3 and a detailed description is available in the Cross Software 
Manual [2]. The Simulator is very useful not only for verifying proper operation and debugging 
improper operation, but also for learning the architecture and instruction set of the 
ADSP-2101 chip. In this section, we will emphasize navigational and configurational aspects 
of the Simulator so that a user can get familiar with its environment. This is particularly useful 
when the ADSP-2101 instructions are studied in later subsections. More intricate functions of 
the Simulator will also be described briefly as they are needed in later chapters. We close this 
section with a few exercises on instruction sets using the Simulator environment. 

The best way to learn the Simulator is to begin using it, experiment with different window 
operations and commands, and then build up custom screens with required fields. At some 
point, perhaps as soon as a few hours after the first encounter, the user can organize all the 
custom screens and commands into a clean set of external files. Thereafter, the user can invoke 
the Simulator with the appropriate startup file identified and the Simulator appears automatically 
in the desired configuration. After describing the necessary keywords for window configuration 
operations, we will provide a sample session of navigating through the Simulator. Although it 
is a very powerful tool in the entire development process, all its features cannot be learned unless 
we are debugging a complete program. Therefore we will emphasize some essential features 
of window commands and practice them through the study of the ADSP-2101's instruction set. 

The DOS command to invoke the Simulator along with its switches is described in Chapter 
3. It is also given below: 

sim2101 [-a arch_file] [-w window_file] [-s script_file] 

The Simulator configures itself according to the target architecture file arch_file (default is 
210x.ACH). It also uses two types of external configuration files: windows and scripts. Window 
files (default is DD.WIN) store the look and layout of a particular set of windows. Scripts (default 
is STARTUP) are text files of command window inputs, typically storing command aliases. In 
order to use the Simulator for program running or debugging, a Linker-created symbol file 
(.SYM) must be available. It is created by using the -g Linker switch (described in Chapter 3). 

4.3.1 Window Configurations 

The Simulator is a window-oriented interactive software tool. Up to ten windows can be opened 
at the same time. At any one time, there is always an active (highlighted) window. The first 
window, window #0, is called the Command Window. It is always present, cannot be deleted, 
and is used for accepting user input. Each newly opened window is assigned the next available 
window number. 

The Window Configuration Menu is invoked by pressing the [Ctrl] and W keys (depicted by ^W) 
simultaneously. It contains commands to open a new window as well as to move, size, 
close, and hide the active window. In the following discussion, note that we will use ^ to 
denote the Control key (Ctrl) and [Enter] to denote the Enter key. Each command can be invoked 
by highlighting the corresponding command and pressing [Enter], or by keying the first character 
of the command. 

Window Operations 

Open Windows 

To open any window: 

- Key ^W to display the Window Commands Menu. 

- Select OPEN. A submenu of window selection will appear. 

- Select the window you wish to open. 

- The selected window will become the active window and will be displayed at the upper 
left corner of the screen. 

Select Active Window 

Any newly opened window will become the active window automatically. There are two ways 
to select one of the opened windows as the active window. 

1. Cycle through and set window active: 

- Key ^Z to set the next window active in the numbered sequence. 

2. Set window active by number: 

- Key ^X, followed by the window number and [Enter], to directly set the specified 
window active, 

e.g., ^X3 [Enter] will set window #3 active. 

Move Windows 

To move any window: 

- Select the window you want to move as the active window. 

- Key ^W to display the Window Commands Menu. 

- Select MOVE. 

- Use the arrow keys to move the window and press [Enter] when done. 

- An optional number, specifying the step size of the next arrow key, can be entered to 
speed up the moving process, 

e.g., 3 followed by the left arrow key will move the window to the left by 3 columns. 

Size Window 

To size any window: 

- Select the window you want to size as the active window. 

- Key ^W to display the Window Commands Menu. 

- Select SIZE. 

- Use the arrow keys to size the window and press [Enter] when done. 

- An optional number, specifying the step size of the next arrow key, can be entered to 
speed up the sizing process, 

e.g., 3 followed by the left arrow key will extend the window to the left by 3 columns. 

Close Window 

To close any window: 

- Select the window you want to close as the active window. 

- Key ^W to display the Window Commands Menu. 

- Select CLOSE. 

- The active window will be closed. 

Hide Window 

To hide any window: 

- Select the window you want to hide as the active window. 

- Key ^W to display the Window Commands Menu. 

- Select HIDE. 

- The active window will be pushed to the bottom of the stack. 

Decimal and Hex Display 

To toggle between decimal and hex display: 

- Select the window whose display you want to toggle. 

- Key ^E to toggle between decimal and hex display. 

Delete Window Field 

To delete a field in the active window: 

- Select the field by moving the cursor onto it. 

- Key ^D. 

- The field will disappear from the display. 

Undelete Window Field 

To undelete a field in the active window: 

- Move the cursor to a blank location in the window. 

- Key ^U. 

- A menu of deleted fields for that window will appear. 

- Select the desired field for undeletion. 

- The deleted field will appear in that blank location. 

Move Window Field 

To move a field in the active window: 

- Select the field by moving the cursor onto it. 

- Key ^Y to toggle on this function. 

- Move the field, using the arrow keys, until it reaches the desired location. 

- Key ^Y again or press [Enter] to toggle off this function. 

Table 4-1 below lists the keys and control key sequences which allow the user to navigate 
between different windows or within the active window. These keystrokes can be used in any 
window. 



Keystrokes                     Description 

^W                             display main menu of window configuration actions 
                               exit a menu without making a selection 
^Z                             move to next (consecutively numbered) window 
^X#[Enter]                     move to window # 
Arrow keys, [PgUp], [PgDn]     scroll through text in a window 

Table 4-1: Window Navigation Controls 

4.3.2 A Sample Session 

The tools for configuring an individual window were briefly described in the previous section. 
We now present a sample session to spell them out in greater detail with an example. In the 
process, we will use some window commands which are described in the next section. Invoke 
the Simulator by typing the following (we are using the architecture description file generated in 
Example 4-1; however, this is not essential and any .ACH file can be used): 

SIM21 -a exmpl_1.ach 

A screen containing the Command Window now appears. 

Opening Windows 

Key ^W to display the menu shown in Figure 4-4. Select OPEN by typing the letter "O" or 
pressing [Enter], since OPEN is the default selection on this menu. 



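[Figure 4-4 screen: the Window Commands Menu (opened with ^W) lists Open, Move, Size, 
Close, and Hide, with a cursor marking the current selection; the Command Window (always 
open) is labeled COMMAND; an informational display shows ^X# (go to window #) and ^Z (go 
to next window).] 

Figure 4-4: Main Menu For Configuring Windows 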



Doing this displays the window selection submenu shown in Figure 4-5. This session 
uses the register window. Select the register window by moving the cursor down with the arrow 
keys and pressing [Enter], or by keying the menu letter "D". 

Figure 4-6 shows the default register window layout. This is the starting point for rear- 
ranging the fields of this window. 
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Cursor 



A program memory 

B data memory 

C program memory data 

E stack 

F cross reference 

G flag 

H break points 

I break expression 

J expression 

K sport register 

L sport status 

M status register 

N I/O status 

control register 

P help 

Q trace 

R profile 

S defaults 

T NEXT PAGE 



Window choices 



A W Window commands 



A X# Go to window# 



*Z Go to next windo 



Figure 4-5: Window Selection Submenu (with Register window selected)

Figure 4-6: Default Register Window Layout

Selecting, Deleting & Rearranging Fields In A Window

Note that only the windows displaying individual fields, like the register window, can be 
rearranged. The internal layout of memory windows and informational windows (like the 
breakpoint windows) cannot be altered. 



For the purpose of illustration, let us assume that there are some registers in the processor 
which are never used. For example, only the SI register in the shifter is used (as a temporary
holding register for signal data) and none of ALU registers are used. Likewise, only some of 
the DAG registers are used. Therefore we are going to delete unused registers from our display 
and rearrange the remaining registers for compactness. 

The first step is to make the register window the active window, if it is not already. Key
^Z until it becomes the active window. The cursor now appears in the active window. Move
the cursor with the arrow keys until it is over the SE field, one of the fields to be deleted. Key
^D to delete this field and it disappears from the display. Now move the cursor and delete the
SB, SR0, and SR1 registers in the same way. Continue on to delete all of the ALU registers
and the unneeded DAG registers: I2, I3, I5-7, M1-3 and M5-7, and L1-3 and L5-7. After these
deletions, the register window looks like Figure 4-7.

Figure 4-7: Example Register Window with some registers deleted

If a register is deleted accidentally, it can easily be restored or "undeleted". Move the cursor
to a blank spot in the window and key ^U. A menu drops down that lists all the deleted registers.
Move the cursor along the menu and press [Enter] to restore any register. Press [Esc] to abort
the operation. Note that the contents of the deleted register are also displayed. If a deleted
register has a value other than undefined, the value is visible in this menu.

Now we can rearrange our pruned-down set of registers for a more compact display. Move
the cursor to the MX0 register field. To move any field, select it for moving with ^Y, move it
using the arrow keys, and deselect it with another ^Y or [Enter]. Move the MX0 field up to the top
line of the window this way. Repeating this procedure, we rearrange all the fields of this register
window example until the entire window appears as shown in Figure 4-8.

Figure 4-8: Example Register Window with registers rearranged



Now we will resize the window outline, bringing up the bottom edge to make that space
on the screen available for other windows. Key ^W to display the menu again and select SIZE
from it. As the up-arrow is pressed, the lower edge of the window moves up on the screen.
Press [Enter] to end the sizing operation. The final version of this window should look like
Figure 4-9.

Saving A Rearranged Screen 

If we change the display without saving our custom register window, all the work of creating it
is lost. To save a new screen configuration with a desired set of windows opened, resized, and
internally reconfigured, we must store the screen in a file. The Simulator command Y stands
for the display, and the greater-than (>) and less-than (<) symbols are directional pipes.
Display files are given the default file extension .WIN. To save the current display (such as our
example in Figure 4-9), enter this command in the command window:

y > 'myscreen'

This stores the current display configuration in the file MYSCREEN.WIN in the default
directory. We can recall this or any other display configuration with the command

y < 'filename'

where filename is the main filename of a screen/window file.

Closing Windows

During the course of a simulation, we may have no use for a particular window but would
like to observe another one. This is done by closing one window and opening another one. In
our session, we want to close our custom Register Window and keep only the Command Window
(which cannot be closed). The first step is to make the register window the active window, if
it is not already. Key ^Z until it becomes the active window. The cursor now appears in the
active window. Key ^W to display the menu shown in Figure 4-4. Select CLOSE by typing
the letter "C". The window now disappears from the screen and the cursor returns to the
Command Window. Close any other open windows.

Command Line Aliases 

A command alias is a character string (plus any required arguments) which replaces one of the
Simulator's native commands. The J command creates the alias. The syntax of this command
is

j alias 'command'

where alias is a symbol which will subsequently stand for the Simulator command enclosed in
single quotation marks.

For example, we can create a command that calls up the file MYSCREEN.WIN, our
previously saved display configuration. Remember that we used the command

y < 'myscreen'

This may be inconvenient to type each time we want this display, especially because the "<"
symbol requires the [Shift] key on most keyboards. We can alias this command with the new
command string "VIEW" by entering the following:

j view 'y < "myscreen"'

Note that the filename, myscreen, is enclosed in double quotes; nested quotation marks must
be double, inside single. Some commands have arguments; these are passed using a dollar sign
token as in $1, $2, etc.

4.3.3 Window Commands



In the sample session of the last section, we used a window command to save the created screen. 
These commands are entered in the Command Window. They can be categorized into different 
Simulator functions as discussed in Chapter 3. These functions include set-up functions, register
functions, memory functions, and control and debugging functions. In this section, we briefly
describe commands from these functions. 

Simulator Set-up Functions 

These include loading the program to be simulated, opening I/O ports and associating data files
with I/O ports and SPORTs for the purposes of simulating input and output data streams. Some
miscellaneous set-up functions are also included.

Loading a Program

To load an ADSP-2101 .EXE and .SYM file, type

L 'filename'

where filename is the filename of the .EXE file. If the .SYM file is not found, a warning message
appears and the Simulator will run without any label and variable information. In other words,
only addresses will be displayed.

Opening and Closing an I/O Port 

To open or close an I/O port, type

O address [> 'outfile.ext'] [< 'infile.ext']

Giving the O command with no optional parameters will close the port at the given address.

Opening and Closing a Serial Port

To open or close a serial port, type

P 0 [> 'outfile.ext'] [< 'infile.ext']
P 1 [> 'outfile.ext'] [< 'infile.ext']

Giving the P command with no optional parameters will close the corresponding serial port in
either case. A 0 on the command line refers to SPORT0 while a 1 signifies that the operation is
done on SPORT1.

Other Defaults 

There are a number of miscellaneous defaults for the operation of the Simulator. These can be 
changed in the defaults window. To change contents of the Default Window: 

- Key ^W to display the Window Commands Menu.

- Select OPEN.

- Select Default Window.

- Modify the contents of the Default Window.

Simulator Inspecting and Altering Register Functions

The ADSP-2101 Simulator provides capabilities to inspect and modify different types of
registers displayed in the register, SPORT, status register, control register and stack windows.
The syntax for inspecting a register, altering a register and undefining a register is described
below.

Inspecting a Register 

Use ? to inspect a register. For example,

? AX0

is used to inspect the contents of the AX0 register.

Altering a Register

You may use one of two forms to alter the contents of any register:

1. Using the 'R' command

R register expression

where register is the name of a processor register and expression is the value to be loaded
into the register.

2. Direct assignment, e.g.,

AX0 = 0x002c

will assign the value 0x002c to register AX0.

Undefining a Register

The command 'U' is used to undefine any given register, e.g.,

u AX0

will undefine register AX0.

Simulator Inspecting and Altering Memory Functions 

This section describes methods for viewing and altering specific locations in any of the
ADSP-2101's program, data and boot memory spaces.

Inspecting a Memory Location 

Use one of the following to inspect memory spaces.

1. The [PgUp] and [PgDn] keys

- Open the desired memory window or make it active if already opened.

- Use the [PgUp] or [PgDn] key to scroll up and down inside the window.

2. The ^G key

- Open the desired memory window or make it active if already opened.

- Key ^G and type in an address when prompted.

- The window will display the input address.

3. The 'K' command in the Command Window

- Make the Command Window the active window.

- Execute the 'K' command, e.g.,

K 1 0x2000

will display memory location 0x2000 in window #1.

4. The 'D' command in the Command Window

- Make the Command Window the active window.

- Execute the 'D' command, e.g.,

D address [> 'filename']
D range [> 'filename']

where address or range specifies a location or range in memory. An optional redirection
argument can be used to redirect the output to a disk file.

Tracking 

Tracking is used to view the code and data memory while single stepping or running a program. 
The instruction being executed will be indicated when tracking is on in the Program Memory 
Window. Any change in data memory will be reflected immediately when tracking is turned 
on in the Data Memory Window. There are two ways to toggle the tracking feature: 

1. The 'T' command in the Command Window

- Select the Command Window as the current window.

- At the command prompt type

T windownumber

where windownumber is the number of the window in which you want to toggle
tracking.

2. The ^T key

- Select the desired memory window as the current window.

- Keying ^T will toggle tracking in the current window.

Locating Symbols and Values 

There are two ways to locate symbols and values in the program memory spaces. 

1. The 'X' command

- The 'X' command takes a symbol as its argument and returns the address of the symbol
if found. Note that symbols are case-sensitive, e.g.,

X restart

will return the address of restart if found.

2. The 'F' command

- The 'F' command requires an address or range specifier as the first argument and an
expression to be found as a second argument. It returns the address of the expression
if found within the range, e.g.,

F pm[0]/100 jump start&r 






will search through the first 100 program memory locations, starting at program memory
location 0.

Altering a Memory Location 

The 'E' command can be used for altering memory spaces. The syntax is as follows:

E address expression

or

E address [< 'filename']

where address is a valid address or range specifier, expression is the value to be deposited in 
the prescribed range, and filename is a data file. 

Altering Instructions 

The 'A' command can be used to alter instructions in program memory spaces without converting
the instruction into an opcode (unlike the 'E' command), e.g.,

A pm[2] nop

will put a NOP instruction in program memory location pm[2].

Undefining a Memory Location

The 'U' command can be used for undefining memory spaces. The syntax is as follows:

U address

where address is an individual address or an address range.

Control and Debugging Functions 

Control functions include starting and stopping the execution of the program and resetting the 
simulated processor. Debugging functions include setting breakpoints, break conditions and 
watch points. 

Control Functions 

1. Chip reset and Reset

- The command CR simulates a hardware reset which includes a boot loading sequence.

- The command RE is the same as CR except it ignores the boot-loading sequence.

2. Single-step instructions

The command

S [stepsize]

executes the next stepsize instructions. It executes only the next instruction if stepsize
is omitted.

3. Running and Halting the processor

- Use the 'G' command to run a program. It will stop when any key is pressed.

- The 'RUNFAST' command is the same as the 'G' command, except it does not stop
when a key is pressed. The program will stop upon detection of an error or when a
break point is encountered.

Break Functions 

These commands halt execution. The Simulator break functions include break points, break
expressions, break changes and break ranges in program memory spaces.

1. Setting break points and break ranges

The command

B address

or

B range

halts execution at the given address or within the given range, where address and range
are valid address or range specifiers.

2. Viewing preset breakpoints

The command

B

with no argument displays a listing of the breakpoints currently activated.

3. Break expressions 

The command 

BE expression 

breaks whenever the expression is true. 

4. Break on changes 

The command 

BC expression 

breaks whenever the expression is changed. 

5. Delete breakpoints 

The command 

BD address 

deletes the breakpoint at the given address.

Watch Functions

Watch functions display a message whenever the watchpoint criterion is met. These do not halt
the execution of the program.

1. Setting watchpoints

The command

W address

sets a watchpoint that displays a message, where address is a valid address or range specifier.

2. Setting watch expressions 

The command 
WE expression 

displays a message when expression is true. 

3. Listing watchpoints and watch expressions 

The command 

W 

displays a listing of the watchpoints and watch expressions.

4. Deleting watchpoints and watch expressions 

The command 
WD watchnumber 

deletes a watch point or watch expression with the corresponding watchnumber. 

Table 4-2 defines the control key sequences used only in particular windows. Table 4-3 lists a
brief summary of some useful commands and their arguments. A complete list of these commands
is available in the Cross Software Manual [2].



Keystrokes   Description

^B           set breakpoint at cursor location in program memory window
^R           reset (delete) breakpoint at cursor location in program memory window
^D           delete field in active window
^U           undelete (restore) field in active window
^Y           move field in active window
^E           toggle numeric display of active window contents between HEX and DEC
^G           go to (prompted for address to be displayed)
^S           choose how many instructions to trace in trace window
^T           toggle tracking on/off in active window
^Q           quit simulator



Table 4-2: Window-Specific Control Key Sequences 

4.3.4 Instruction Set Work-Out Using The Simulator 

In Chapter 2, we described the complete instruction set of the ADSP-2101. To obtain a better 
understanding of each instruction, it is necessary to peek inside the processor and observe how 
the instruction works and how several registers are affected. This is possible using the Simulator 
because it allows us to execute individual instructions and then open different windows and
look inside every field. One can also use the Emulator for this purpose but it is not as convenient 
as the Simulator. In this section, we will study some representative instructions from each group 
with emphasis on their effects on contents of various hardware elements. This will help us to 
understand the capabilities and limitations of those instructions. Some instructions like those 
in the Program Flow Control group are not possible to study using this approach because they 
require branching to another address location.

Command                       Description

A addr instr                  assemble instruction at address
B                             list breakpoints, break expressions, and break change expressions
BD addr or number             delete breakpoint, break range, break change or break expression
CR                            chip reset with boot page load
G [addr]                      start program execution
I int# min max                cause periodic interrupt
J symbol 'command'            alias a command string
J                             list aliases
JD symbol                     delete alias
L 'file'                      load program into memory
O addr [<'file'] [>'file']    open I/O port at addr and assign I/O file
O addr                        close I/O port at addr
P SPORT# [<'file'] [>'file']  open serial port 0 or 1 and assign I/O file
Q                             quit simulator, with verification from user
R reg expr                    set register equal to value of expression
RUNFAST                       start program execution, no halt on key hit
S [number]                    single step program execution
V instr                       assemble and execute instruction
W addr or range               set watch point
WD number                     delete watch point #
Y [>'file'] [<'file']         save/restore display configuration
Z [>'file'] [<'file']         save/restore Simulator state
? expr                        evaluate expression



Table 4-3: Command Window Commands (Brief List) 

Arithmetic on the ADSP-2101 

To better understand the detailed discussion in the work-out, the user should first understand 
how the ADSP-2101 handles binary arithmetic. The ADSP-2101 is a 16-bit, fixed-point machine.
Special features support multiword arithmetic and block floating point. Most operations assume 
a twos-complement number while others assume an unsigned number or a simple binary string. 
In this section, we discuss the arithmetic used by each computational unit or operation. 

Binary String 

This is the simplest binary notation; sixteen bits are treated as a bit pattern. Examples of 
computation using this format are the logical operations: NOT, AND, OR and XOR. These 
ALU operations treat their operands as binary strings with no provision for sign bit or binary 
point placement.

Unsigned Numbers

Unsigned binary numbers may be thought of as positive, having nearly twice the magnitude of 
a signed number of the same length. The least significant words of multiple precision numbers 
are treated as unsigned numbers. 

Signed Numbers: Twos-Complement 

In discussions of ADSP-2101 arithmetic, "signed" refers to twos-complement. Most
ADSP-2101 operations presume or support twos-complement arithmetic. The ADSP-2101 does not
use signed-magnitude, ones-complement, BCD or excess-n formats.

Fractional Representation: 1.15 

The ADSP-2101 is optimized for arithmetic values in a fractional binary format denoted by
1.15 ("one dot fifteen"). (This format is referred to in some contexts as 16.15 or Q15.) This is
a fixed-point format. Used with the most significant bit (MSB) as a sign bit, 1.15 means one
sign bit and fifteen fractional bits representing values from -1 up to one least significant bit
(LSB) less than +1.
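
The 1.15 interpretation can be checked away from the Simulator with a few lines of Python. This
is only an illustrative sketch: the function names q15_to_float and float_to_q15 are invented
here and are not part of any ADSP-2101 tool, and the saturating behaviour of the second
function is an assumption made for convenience.

```python
def q15_to_float(word):
    """Interpret a 16-bit word as a signed 1.15 fraction."""
    word &= 0xFFFF
    if word & 0x8000:            # sign bit set: twos-complement negative
        word -= 0x10000
    return word / 32768.0        # 15 fractional bits


def float_to_q15(value):
    """Quantize a value to the nearest 1.15 word (assumed to saturate)."""
    scaled = int(round(value * 32768))
    scaled = max(-32768, min(32767, scaled))
    return scaled & 0xFFFF


print(hex(float_to_q15(0.5)))    # 0x4000
print(q15_to_float(0x8000))      # -1.0
print(q15_to_float(0x7FFF))      # 0.999969482421875
```

The last value is one LSB (2^-15) less than +1, which is the upper limit quoted above.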

ALU Arithmetic 

All operations on the ALU treat operands and results as simple 16-bit binary strings, except the
signed division primitive (DIVS). Various status bits treat the results as signed: the overflow
(AV) condition code, and the negative (AN) flag.

The logic of the overflow bit (AV) is based on twos-complement. It is set if the MSB 
changes in a manner not predicted by the signs of the operands and the nature of the operation. 
For example, adding two positive numbers must generate a positive result; a change in sign bit 
signifies an overflow and sets AV. Adding a negative and a positive may result in either a 
negative or positive result, but cannot overflow. 

The logic of the carry bit (AC) is based on unsigned-magnitude. It is set if a carry is 
generated from bit 16 (the MSB). The AC bit is most useful for the lower word portions of a 
multiword operation. 
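
The role of the carry in multiword arithmetic can be sketched in Python as follows. This is a
model written for this discussion, not ADSP-2101 code: add16 and add32 are hypothetical
helper names, and the low words are treated as unsigned, as described above.

```python
def add16(x, y, carry_in=0):
    """Add two 16-bit words; return (16-bit result, carry out)."""
    total = (x & 0xFFFF) + (y & 0xFFFF) + carry_in
    return total & 0xFFFF, total >> 16


def add32(x_hi, x_lo, y_hi, y_lo):
    """32-bit addition built from two 16-bit additions."""
    lo, ac = add16(x_lo, y_lo)       # low words: the carry out plays the role of AC
    hi, _ = add16(x_hi, y_hi, ac)    # high words: add with the carry from below
    return hi, lo


hi, lo = add32(0x0001, 0xFFFF, 0x0000, 0x0001)
print(hex(hi), hex(lo))              # 0x2 0x0
```

Here 0x0001FFFF + 0x00000001 = 0x00020000: the low-word addition overflows as an unsigned
quantity, and the carry is what moves that overflow into the high word.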

MAC Arithmetic 

The multiplier produces results that are binary strings. The inputs are interpreted according to 
the information given in the instruction itself (signed times signed, unsigned times unsigned, a 
mixture or round). The 32-bit result from the multiplier is assumed to be signed, in that it is 
sign-extended across the full 40-bit width of the MR register set. 

The ADSP-2101 supports two modes of format adjustment: the fractional mode for
fractional operands in 1.15 format (1 sign bit, 15 fractional bits), and the integer mode for
integer operands in 16.0 format. When multiplying 1.15 operands, the result is in 2.30 format
(30 fractional bits). To correct this, in the fractional mode, a left shift occurs between the
multiplier product (P) and the multiplier result register (MR). This shift (1 bit to the left)
causes the multiplier result to be in 1.31 format, which can be rounded to 1.15. In the integer
mode, the left shift does not occur. For example, if the operands are in the 16.0 format, the
32-bit multiplier result would be in 32.0 format. A left shift would change the numerical
representation, resulting in an incorrect value.
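
The two format-adjustment modes can be sketched in Python. This is a simplified model under
stated assumptions: only signed-times-signed multiplication is modelled, multiply_ss and
to_signed16 are names invented for this sketch, and the result is returned as the three
pieces MR2, MR1 and MR0 of the 40-bit MR register.

```python
def to_signed16(word):
    """Interpret a 16-bit word as a twos-complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def multiply_ss(x, y, integer_mode=False):
    """Signed x signed multiply; returns (MR2, MR1, MR0)."""
    product = to_signed16(x) * to_signed16(y)   # multiplier product P
    if not integer_mode:
        product <<= 1                           # fractional mode: 2.30 -> 1.31
    mr = product & 0xFFFFFFFFFF                 # sign-extended across 40 bits
    return (mr >> 32) & 0xFF, (mr >> 16) & 0xFFFF, mr & 0xFFFF


print([hex(v) for v in multiply_ss(0x4000, 0x4000)])            # ['0x0', '0x2000', '0x0']
print([hex(v) for v in multiply_ss(2, 3)])                      # ['0x0', '0x0', '0xc']
print([hex(v) for v in multiply_ss(2, 3, integer_mode=True)])   # ['0x0', '0x0', '0x6']
```

In fractional mode 0.5 times 0.5 gives MR1 = 0x2000 (0.25 in 1.15 format), while the integer
operands 2 and 3 give 6 only when the left shift is suppressed.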

Shifter Arithmetic

Most operations in the Shifter are explicitly geared to signed (twos-complement) or unsigned 
values: Logical shifts assume unsigned-magnitude or binary string values and Arithmetic Shifts 
assume twos-complement. The exponent logic assumes twos-complement numbers. It supports 
block floating point, which is also based on twos-complement fractions. 

We will now study instructions from various groups using the Simulator. The most useful
Simulator command that we will use is the "V" command, which performs line assembly and
execution of an instruction. We will also need the Register window, which provides a display
of all general purpose registers. To follow the exercises in this section, make sure that the
Simulator is invoked properly and that the Command window is the active window. (In the
following, we will use the Command Window prompt > to indicate a command line.)

ALU Group 

This group contains arithmetic and logical operations which are easy to understand. First, open
a Register window and Flag window, and then execute the following instructions.

> V AX0 = 3;

> V AY0 = 5;

> V AR = AX0 + AY0;

The first instruction puts decimal 3 in AX0 and the second one puts decimal 5 in AY0. Observe
the register window as these instructions are executed. The default mode for the register contents
is hex. Use ^E to toggle between hex and decimal mode. The third instruction adds AX0 and
AY0 and puts the result in the AR register. The AR register now changes to 8 and all flags are
cleared to 0. Now execute the following instruction, which performs subtraction.

> V AF = AX0 - AY0;

The register AF changes to 0xFFFE (hex) or -2 (decimal). AN is set to 1 since the result is a
negative value, and hence ASTAT becomes 0x02.

Now let us try some instructions which produce arithmetic overflow. 

> V AX1 = 0x7FFF;

> V AY1 = 0x7FFF;

> V AR = AX1 + AY1;

AX1 and AY1 are set to 32767. After the addition, the AR register changes to 0xFFFE, which is
65534 as an unsigned number but is interpreted as -2 in twos-complement arithmetic. Adding two
positive numbers has produced a negative result, which signals overflow and also sets the AN
flag. Therefore AV and AN are set to 1, which changes ASTAT to 0x06.

> V AX0 = 0x9000;

> V AY0 = 0xA000;

> V AR = AX0 + AY0;

AX0 is set to -28672 and AY0 is set to -24576. Since 0x9000 + 0xA000 = 0x13000, AR changes
to 0x3000 or 12288, but the overflow and carry flags are triggered. AV and AC are set to 1 but AN
changes to 0 (why?). Therefore ASTAT becomes 0x0C.

> V AX0 = 0;

> V AY0 = 0;

> V AR = AX0 + AY0 + C;

The last instruction adds AX0 and AY0 along with the carry bit C. Since the carry bit was set by
the last add instruction, AR is equal to 1 and all flags are cleared. ASTAT is also cleared to 0.
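
The flag values seen in this work-out can be reproduced with a small Python model. It is a
sketch under stated assumptions, not a description of the Simulator: only the AZ, AN, AV and
AC flags are modelled, their positions in ASTAT (bits 0 to 3) are inferred from the ASTAT
values quoted in this section, subtraction is assumed to be computed as X + NOT(Y) + 1, and the
names alu_add and alu_sub are invented here.

```python
def alu_add(x, y, carry_in=0):
    """16-bit addition; returns (AR, ASTAT) with AZ, AN, AV, AC only."""
    x &= 0xFFFF
    y &= 0xFFFF
    total = x + y + carry_in
    ar = total & 0xFFFF
    az = int(ar == 0)                     # zero flag
    an = (ar >> 15) & 1                   # negative flag: MSB of the result
    # overflow: both operands have a sign that the result does not have
    av = int(((x ^ ar) & (y ^ ar) & 0x8000) != 0)
    ac = total >> 16                      # carry out of the 16-bit word
    return ar, az | (an << 1) | (av << 2) | (ac << 3)


def alu_sub(x, y):
    """Subtraction modelled as x + (NOT y) + 1 (an assumption of this sketch)."""
    return alu_add(x, ~y & 0xFFFF, 1)


for ar, astat in (alu_add(3, 5), alu_sub(3, 5),
                  alu_add(0x7FFF, 0x7FFF), alu_add(0x9000, 0xA000),
                  alu_add(0, 0, carry_in=1)):
    print(hex(ar), hex(astat))
# 0x8 0x0
# 0xfffe 0x2
# 0xfffe 0x6
# 0x3000 0xc
# 0x1 0x0
```

The fourth line answers the "why?" posed above: the 16-bit result 0x3000 has its MSB clear, so
AN is 0 even though both operands were negative.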

Try executing ALU instructions given in Chapter 2 and exercise the different ALU flags. 

MAC Group 

This group implements multiply and multiply/add operations. The important thing to learn here
is how this multiplication is performed. The M_MODE bit (bit 4 in the MSTAT register) controls
whether the multiplication is in the default fractional format or in integer format. With the
Register and Flag windows open, execute the following MAC instructions.

> V MX0 = 0x4000;

> V MY0 = 0x4000;

> V MR = MX0 * MY0 (SS);

The MAC assumes data in a 1.15 fractional format. The first two instructions therefore put 0.5
in the MX0 and MY0 registers. The product of two 1.15 format numbers is a number in
2.30 format. However, the MAC changes this format by an automatic 1-bit left shift in the default
mode, and the result is a number in 1.31 format. The third instruction multiplies MX0 and MY0
(where both are assumed to be in signed form) and puts the result in MR. MR is a 40-bit register
which is displayed as two 16-bit registers, MR0 and MR1, and an 8-bit register, MR2. The result
of the above operation is illustrated below. No flags are set in this operation.

  MX0       = 0.100 0000 0000 0000
* MY0       = 0.100 0000 0000 0000

  MX0 * MY0 = 0.001 0000 0000 0000 0000 0000 0000 0000

After 1-bit shift to the left

  MX0 * MY0 = 0.010 0000 0000 0000 0000 0000 0000 0000   (0.25 decimal)

=> MR2 = 0x00
=> MR1 = 0x2000
=> MR0 = 0x0000

Let us now try the following instructions. 

> V MX1 = 2;

> V MY1 = 3;

> V MR = MX1 * MY1 (SS);

Note that decimals 2 and 3 are put in MX1 and MY1 respectively. The result however is not
the expected integer 6 but 12 because of the default fractional format as explained below. Also
note that the result is interpreted not as an integer 12 but some fractional number (which one?).

  MX1       = 0.000 0000 0000 0010
* MY1       = 0.000 0000 0000 0011

  MX1 * MY1 = 0.000 0000 0000 0000 0000 0000 0000 0110

After 1-bit shift to the left

  MX1 * MY1 = 0.000 0000 0000 0000 0000 0000 0000 1100

=> MR2 = 0x00
=> MR1 = 0x0000
=> MR0 = 0x000C

Now we will turn on the integer multiplication mode. This is done by enabling the M_MODE
bit. Execute the following instructions.

> V ENA M_MODE;

> V MR = MX1 * MY1 (SS);

The M_MODE bit (bit 4 of the MSTAT register) is set after execution of the ENA (Enable)
instruction. It forces subsequent MAC instructions to suppress the automatic left shift of the
product and implements an integer product. Multiplication of MX1 and MY1 now results in 6.

  MX1       = 0.000 0000 0000 0010
* MY1       = 0.000 0000 0000 0011

  MX1 * MY1 = 0.000 0000 0000 0000 0000 0000 0000 0110

=> MR2 = 0x00
=> MR1 = 0x0000
=> MR0 = 0x0006

Let us disable the integer mode and study some other aspects of MAC instructions. Try 
the following. 

> V DIS M_MODE;

> V MR1 = 0x7FFF;

> V MR0 = 0xFFFF;

> V MR = MR + MX0 * MY0 (SS);

MR is set to the highest positive number in 1.31 format, which is approximately 1. Therefore
any addition to it will cause an overflow. The last instruction is a multiply/add instruction
which multiplies MX0 (0x4000) with MY0 (0x4000), adds the result to MR and stores the new
value in MR.

  MX0       = 0.100 0000 0000 0000
* MY0       = 0.100 0000 0000 0000

  MX0 * MY0 = 0.010 0000 0000 0000 0000 0000 0000 0000   (after 1-bit shift)
+ MR        = 0.111 1111 1111 1111 1111 1111 1111 1111

  MR        = 1.001 1111 1111 1111 1111 1111 1111 1111

=> MR2 = 0x00
=> MR1 = 0x9FFF
=> MR0 = 0xFFFF

The multiply overflow bit MV is set to 1 and ASTAT changes to 0x40. This overflow bit can
be used to take some action. Execute the following conditional instruction.

> V IF MV SAT MR;

Since MV is set, MR will be saturated and set to the maximum possible fraction.

MR2 = 0x00
MR1 = 0x7FFF
MR0 = 0xFFFF
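
The overflow and saturation steps above can be mimicked in Python. The sketch below rests on
assumptions that go beyond this section: MV is taken to be set whenever the upper nine bits of
the 40-bit MR (bits 31 to 39) are not all zeros or all ones, and saturation is taken to clamp
MR to the largest positive or most negative 1.31 value depending on the top bit of MR2. The
names mac_ss, sat_mr and split are invented for this illustration.

```python
MASK40 = (1 << 40) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a twos-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac_ss(mr, x, y):
    """MR = MR + x * y (SS) in fractional mode; returns (40-bit MR, MV)."""
    product = (to_signed(x, 16) * to_signed(y, 16)) << 1
    mr = (to_signed(mr, 40) + product) & MASK40
    upper9 = mr >> 31                        # bits 39..31
    mv = int(upper9 not in (0x000, 0x1FF))   # not a pure sign extension
    return mr, mv


def sat_mr(mr, mv):
    """Model of IF MV SAT MR (assumed clamping values)."""
    if not mv:
        return mr
    return 0xFF80000000 if mr >> 39 else 0x007FFFFFFF


def split(mr):
    """Return (MR2, MR1, MR0)."""
    return (mr >> 32) & 0xFF, (mr >> 16) & 0xFFFF, mr & 0xFFFF


mr, mv = mac_ss(0x007FFFFFFF, 0x4000, 0x4000)
print([hex(v) for v in split(mr)], mv)            # ['0x0', '0x9fff', '0xffff'] 1
print([hex(v) for v in split(sat_mr(mr, mv))])    # ['0x0', '0x7fff', '0xffff']
```

The accumulated value no longer fits in 32 bits as a signed number, which is what MV reports;
saturating replaces it with the nearest representable 1.31 value.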

Try executing some more MAC instructions and learn about different formats and the 
saturation mode. 

Shifter Group

This group performs arithmetic/logic shifts and operations pertaining to floating-point
arithmetic. These instructions are fairly easy to understand. With the Register and Flag windows
open, execute the following instructions.

> V SI = 0x1111;

> V SR = LSHIFT SI by 2 (HI);

The first instruction puts hex 1111 in register SI. The second instruction, LSHIFT, is a logical
shift (not left shift) instruction in which the 32-bit output field SR is zero-filled from the
right for a left shift and zero-filled from the left for a right shift. The above instruction
shifts SI logically to the left by 2 bits (multiplying the number by 2^2), giving 0x4444. The
result is stored in the high 16 bits of the SR register, i.e., SR1 = 0x4444 and SR0 = 0x0000.

> V SR = LSHIFT SI by -2 (HI);

Now SI is shifted logically 2 bits to the right (multiplying the number by 2^-2) and the result
is stored relative to the high half of SR, hence SR1 = 0x0444 and SR0 = 0x4000.

> V SR = SR or LSHIFT SI by 1 (LO);

When SI is logically shifted to the left by 1 bit, the result is 0x2222. After the OR operation,
SR becomes

   SR = 0x0444 4000
OR      0x0000 2222

   SR = 0x0444 6222

=> SR1 = 0x0444
=> SR0 = 0x6222

Next, we will consider an arithmetic shift in which a 32-bit output field SR is zero-filled 
from right for left shift and sign-extended to the left for right shift. 

> V SI = 0xC000;

> V SR = ASHIFT SI by -8 (HI);

In the first instruction SI is set to 0xC000. The second instruction shifts SI arithmetically 8
bits to the right and the result is stored in the high 16 bits of the SR register. The result is
0xFFC0.

SI        = 1100 0000 0000 0000
ASHIFT -8 = 1111 1111 1100 0000

=> SR1 = 0xFFC0
=> SR0 = 0x0000
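
The shift results above can be verified with a short Python model of the 32-bit SR field. It
is a sketch covering only the cases used in this section: lshift, ashift and shift32 are names
invented here, positive shift values mean a left shift and negative values a right shift, and
the arithmetic shift is modelled for the (HI) option only.

```python
def shift32(value, amount, arithmetic=False):
    """Shift a 32-bit field: positive amount = left, negative = right."""
    value &= 0xFFFFFFFF
    if arithmetic and value & 0x80000000:
        value -= 1 << 32                  # negative value, so >> sign-extends
    shifted = value << amount if amount >= 0 else value >> -amount
    return shifted & 0xFFFFFFFF


def lshift(si, amount, hi=True):
    """Logical shift of SI, referenced to the high (HI) or low (LO) half of SR."""
    field = (si & 0xFFFF) << 16 if hi else si & 0xFFFF
    return shift32(field, amount)


def ashift(si, amount):
    """Arithmetic shift of SI referenced to the high half of SR (HI only)."""
    return shift32((si & 0xFFFF) << 16, amount, arithmetic=True)


print(hex(lshift(0x1111, 2)))                 # 0x44440000
sr = lshift(0x1111, -2)
print(hex(sr))                                # 0x4444000
sr |= lshift(0x1111, 1, hi=False)             # SR = SR OR LSHIFT SI BY 1 (LO)
print(hex(sr))                                # 0x4446222
print(hex(ashift(0xC000, -8)))                # 0xffc00000
```

Each 32-bit value splits into SR1 (upper 16 bits) and SR0 (lower 16 bits), matching the
register contents listed above.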

The Shifter also performs exponent extraction and normalization operatians needed in 
floating-point arithmetic. Consider the following instructions. 

> V SI = 0x0FFF;

> V SE = EXP SI (HI);

The first instruction sets SI to 0x0FFF. In 1.15 format, SI = 0.000 1111 1111 1111. There are
3 zeroes to the right of the binary point. The second instruction extracts the exponent for
normalization. Therefore, SE = -3 (decimal) or 0xFD. Note that SE is set to the negative of
the exponent needed for normalization. In the following instruction

> V SR = NORM SI (HI);

since SE = -3, the contents of SI are logically shifted to the left 3 bits and the result is
stored in the high 16 bits of the SR register, hence
SR1 = 0x7FF8 and SR0 = 0x0000
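
Exponent extraction and normalization can be sketched in Python as below. The model is
deliberately narrow: it covers only the (HI) option, counts redundant sign bits to obtain SE,
and assumes SE is zero or negative when normalizing. The names exp_hi and norm_hi are
hypothetical and chosen for this sketch only.

```python
def exp_hi(si):
    """Return SE = -(number of redundant sign bits) of a 16-bit word."""
    si &= 0xFFFF
    sign = si >> 15
    redundant = 0
    for bit in range(14, -1, -1):         # scan from bit 14 downwards
        if ((si >> bit) & 1) != sign:
            break
        redundant += 1
    return -redundant


def norm_hi(si, se):
    """Shift SI left by -SE bits into the high half of SR (assumes se <= 0)."""
    return (((si & 0xFFFF) << 16) << -se) & 0xFFFFFFFF


se = exp_hi(0x0FFF)
print(se, hex(se & 0xFF))                 # -3 0xfd
print(hex(norm_hi(0x0FFF, se)))           # 0x7ff80000
```

After normalization the first bit after the sign bit is significant, so the full 15 fractional
bits carry information.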

Try executing some more shifter instructions given in Chapter 2. 

DAG Group 

This set of instructions performs data move operations by generating appropriate addresses. The
important thing to learn here is how addresses are generated by the I registers and modified by the M
and L registers. In this section we will move values into data memory. Therefore open a Register
window as well as a Data Memory window and execute the following instructions.

> V I0 = 4;

> V M1 = 3;

> V L0 = 8;

> E DM[0]/8 1234;

The first three instructions move values into the I0, M1 and L0 registers. The last command, "E",
enters 0x1234 into 8 data memory locations beginning at 0. Therefore the contents of dm[0]
through dm[7] are equal to 0x1234. Now execute the following data memory read instruction.

> V AX0 = DM(I0, M1) ;

The index register I0 is equal to 4 before the execution, therefore DM[4] (= 0x1234) is read
into AX0. After the memory read, I0 = (I0 + M1) mod L0. Hence I0 becomes (4 + 3) mod 8 = 7.

> V AY0 = DM(I0, M1) ;

Now AY0 = DM[7] = 0x1234, while I0 changes to (7 + 3) mod 8 = 2. Therefore the next time
data memory is read, I0 will point towards memory location 2. This is called circular buffering
and it is an important operation in digital signal processing. It should be noted that the buffer
length register L# must go with the index register I#, while the modify register M# can be
any one from the set.
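The post-modify arithmetic of the two reads above can be traced with a few lines of Python. This is only a sketch under stated assumptions: the function name read_post_modify is hypothetical, the buffer is assumed to start at address 0 as in the example (so the wrap-around reduces to a plain modulo), and L = 0 is taken to mean ordinary linear addressing.

```python
def read_post_modify(dm, i, m, l):
    """Return (value, next_i) for a read such as AX0 = DM(I0, M1).

    Assumes the circular buffer starts at address 0, so the update is
    (I + M) mod L. L = 0 is treated as linear (non-circular) addressing.
    """
    value = dm[i]
    next_i = (i + m) % l if l else i + m
    return value, next_i

dm = [0x1234] * 8            # DM[0]..DM[7] as filled by the E command
i0, m1, l0 = 4, 3, 8
for _ in range(3):
    value, i0 = read_post_modify(dm, i0, m1, l0)
    print(hex(value), i0)    # I0 steps through 7, 2, 5
```

Running the loop a few more times shows I0 visiting every location of the 8-word buffer, because the modifier 3 and the length 8 have no common factor.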

Data Address Generator 1 also has the capability to provide bit-reverse logic intended for
use in fast Fourier transform computations where inputs are supplied or outputs generated in
bit-reversed order. The pivot point for the reversal is the midpoint of the 14-bit address, between
bits 6 and 7. This is illustrated in the following chart.

                Individual DMA lines (DMA N)

Normal order    13 12 11 10 09 08 07 06 05 04 03 02 01 00
Bit-reversed    00 01 02 03 04 05 06 07 08 09 10 11 12 13

Bit-reversed addressing mode is enabled or disabled by setting or clearing bit 1 in the mode status
register (MSTAT). Try the following instructions.

>V ENA BIT_REV;
>V SI = DM(I0,M1) ;

The first instruction enables bit reversal. When enabled, all addresses generated using index
registers I0-I3 are bit-reversed upon output. Since I0 = 2, the generated address is 4096 (why?).
Hence DM[4096] is read into SI. However, after the memory read, I0 changes to (2 + 3) mod 8 =
5. In other words, the modified value stored back after post-update remains in normal order.
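The address 4096 follows directly from the chart: bit 1 of the 14-bit address swaps places with bit 12. A minimal Python sketch of the reversal is given below; the function name bit_reverse is hypothetical and the code only mirrors the chart, it does not model the DAG hardware.

```python
ADDRESS_BITS = 14

def bit_reverse(address, bits=ADDRESS_BITS):
    """Mirror the low `bits` bits of address about their midpoint."""
    reversed_address = 0
    for k in range(bits):
        if address & (1 << k):
            reversed_address |= 1 << (bits - 1 - k)
    return reversed_address

print(bit_reverse(2))        # 4096
```

Applying the function twice returns the original address, which is why the same mode can serve either the input or the output side of an FFT.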

Try executing some more DAG instructions given in Chapter 2. 

4.4 IN-CIRCUIT EMULATOR (EZ-ICE) 

The "EZ-ICE" In-Circuit Emulator provides a testing and debugging environment to check
digital signal processing application programs through a menu-driven, easy-to-use interface.
Even though the Simulator supports debugging facilities, real-time operational testing can
only be done on an Emulator. It is an essential tool in an industrial setting. In an educational
(laboratory) environment in which the emphasis is on the basics of digital signal processor
programming, the Emulator is useful only as a final link in the development. This section
therefore deals with some basic operations of the EZ-ICE system and describes the Command
Menu structure. Detailed information can be obtained from the EZ-ICE manual [3].

The target system for the EZ-ICE (also referred to as the Probe) should also be available
before programs can be tested. We assume that such a target system is the EZ-LAB system
described in Chapter 1. To communicate with the EZ-ICE through the PC, a terminal emulator
program is required. For the purpose of discussion we also assume that a program called
PROCOMM is available. However, this is not essential and any communication program capable
of emulating a VT100 type terminal can be used.

4.4.1 Hardware Operations of EZ-ICE

Consult your lab supervisor for power supply and PC connections to the EZ-ICE and EZ-LAB
boards. The general power-up and power-down sequences are described below.

Power Up Sequences

• Start terminal emulation software (e.g. Procomm) on your PC.

• If separate power supplies are used for EZ-LAB and EZ-ICE, then
-Turn on the power supply for EZ-ICE,
-Press [Enter] on your PC,
-When advised to turn on power of the target system (e.g. EZ-LAB), do so.

• If a single power supply is used for both EZ-ICE and EZ-LAB, then
-Turn on the power supply,
-Press [Enter] on your PC,
-Press [Enter] again to continue.

The above procedures are important in order to safeguard the EZ-LAB and EZ-ICE circuitry.

Power Down Sequences

• If separate power supplies are used for EZ-LAB and EZ-ICE, remove power from
EZ-LAB FIRST before removing power from EZ-ICE.

• If a single power supply is used, turn it off when you are done.

4.4.2 Invoking Emulator Firmware 

The Emulator has its own software stored on a ROM which contains diagnostic tests and its
menu system. The power-up sequence resets the Emulator; pressing the RESET switch on
the board produces the same result.

Initial Display 

After EZ-ICE is reset, press the PC's [Enter] key to obtain the following initial display.

"EZ-ICE" ADSP-2101 IN-CIRCUIT EMULATOR - Analog Devices, Inc. Version

RS-232 COMMUNICATIONS ESTABLISHED AT 9600 BAUD.

TURN ON POWER FOR TARGET SYSTEM NOW. 
— Hit Carriage Return To Continue — 

> SELF TEST IN PROGRESS! 

1) HOST RAM TEST 
HOST RAM OK! 

2) ADSP-2101 FUNCTION TEST 
ADSP-2101 OK! 

** MMAP Configuration Is Set For Internal Program Memory At Location 0000 **

Do You Wish To Use Overlay Memory (Y or N) ?
**IMPORTANT**
Target Memory MUST Be Removed From Your Circuit If You Use Overlay Memory!

If the answer is 'N', the EZ-ICE menu will appear. If the answer is 'Y', the software will
invoke the overlay memory test.

Overlay Memory Test 

If the answer to the initial display is 'Y', the overlay memory test will be invoked. The overlay
memory test screen should look as follows:

3) OVERLAY MEMORY TEST
**IMPORTANT**

Target Memory MUST Be Removed From Your Circuit For This Test !
This Test Should Only Be Run If You Intend To Use Overlay Memory,
Since Overlay Memory Is Enabled And Left Enabled After The Test .

(See Manual For Jumper Configuration)
Type TEST To Proceed With Test, Hit Any Other Key To Omit Test!
>

If you type anything other than "test", the overlay memory test will be skipped. Otherwise,
the following question will be displayed on the screen.

What Is The Position Of Jumper JP1 (1, 2 or 3)?

Type in the position of jumper JP1 and the overlay memory test begins. If the overlay
memory test fails, verify that the jumper position you entered and the actual jumper position
are the same. Also, make sure the target memory that the overlay memory replaces has been
removed.

4.4.3 EZ-ICE Basic Command Menu 

All emulator operations originate either from the keyboard input at this level, or from one of
the subordinate displays or menus reached through it. The Basic Command Menu looks as
follows:

"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu

Read/Write Registers
Read Data Memory
Write Data Memory
Read Program Memory
Write Program Memory
Reset Processor
Single Step
Set, Clear, List Breakpoints
Run Until Breakpoint
Boot From External EPROM
Download From Host
Upload To Host
Display Stacks

You can control the menu selection by positioning the cursor on the item desired using
the Up and Down Arrow keys and pressing [Enter].

Read/Write Registers 

This is the first entry on the Basic Command Menu. You can view and modify the contents
of processor registers manually. The Read/Write Registers display is shown as follows:

"EZ-ICE" ADSP-2101 Register Display (Values Are Displayed in HEX)

ALU                                      Address Generator #1
AX0 0000                                 I0 0000  M0 0000  L0 0000
AX1 0000  AR 0000                        I1 0000  M1 0000  L1 0000
AY0 0000  AF 0000                        I2 0000  M2 0000  L2 0000
AY1 0000                                 I3 0000  M3 0000  L3 0000
AN  AV  AZ  AC  AQ  AS

Multiplier-Accumulator                   Address Generator #2
MX0 0000                                 I4 0000  M4 0000  L4 0000
MX1 0000  MR2 0000  MR1 0000             I5 3000  M5 0000  L5 0000
MY0 0000  MF 0000                        I6 3000  M6 0000  L6 0000
MY1 0000                                 I7 07FF  M7 0000  L7 0000

Shifter
SI 0000  SE 0000  SR1 0000  SR0 0000     PX 00CF
SB 0000

CNTR 0000  Astat 0000  Mstat 0000  Sstat 0055  0000  Icntl 0017
PC 0000    RX0 006A    TX0 0055    RX1 006A    TX1 0055

Single Step
Display Control Registers
Return To Basic Command Menu
** Place Cursor Over First Digit Of Data Field For Data Entry **

The keyboard keys you may use in this display are:

Up Arrow       Move up one line
Down Arrow     Move down one line
Left Arrow     Move left one column
Right Arrow    Move right one column
Tab            Move 10 positions to right on a line

There are three options associated with this display: Single Step, Display Control Registers
and Return to Basic Command Menu.

Single Step

This option allows execution of instructions while monitoring the resulting states of the registers.
It is similar to the Simulator's single stepping of instructions.

Display Control Registers

This option will display the contents of the 17 memory-mapped Control Registers found from
0x3FEF to 0x3FFF in data memory. We can view and modify the contents of all the control
registers. This display is shown as follows:

ADSP-2101 Control Register Display (Values Displayed In HEX)

System Control Register                               001F
Data Memory Wait State Control Register               7FFF
TPERIOD  Period Register                              0000
TCOUNT   Counter Register                             0000
TSCALE   Scaling Register                             0000
SPORT0   Multichannel RCV Word Enable Reg (MS)        80FD
SPORT0   Multichannel RCV Word Enable Reg (LS)        0000
SPORT0   Multichannel TX Word Enable Reg. (MS)        0080
SPORT0   Multichannel TX Word Enable Reg. (LS)        FF7F
SPORT0   Control Register                             0000
SPORT0   SCLKDIV Serial Clock Divide Modulus          0000
SPORT0   RFSDIV Receive Frame Sync Divide Modulus     0000
SPORT0   Autobuffer Control Register                  0000
SPORT1   Control Register                             0000
SPORT1   SCLKDIV Serial Clock Divide Modulus          0000
SPORT1   RFSDIV Receive Frame Sync Divide Modulus     0000
SPORT1   Autobuffer Control Register                  0000

Return To Data Register Display

** Place Cursor Over First Digit Of Data Field For Data Entry **

Return to Basic Command Menu

Pressing [Enter] on this selection will display the Basic Command Menu.

Read Data Memory

This entry allows us to view the 16-bit contents of the selected data memory location plus ten
locations above and below the address. For example, to read data memory at hex location 3810,
select the Read Data Memory entry and enter the location as shown below.

"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu

Read/Write Registers 
Read Data Memory 
Write Data Memory 
Read Program Memory 
Write Program Memory 
Reset Processor 
Single Step 

Set, Clear, List Breakpoints 
Run Until Breakpoint 
Boot From External EPROM 
Download From Host 
Upload To Host 
Display Stacks 



ENTER HEX MEMORY ADDRESS: 3810 



After specifying the address 3810, the corresponding data memory locations are displayed
as follows:

  HEX Addr.    DM Value (HEX)
  3806         400C
  3807         0000
  3808         0000
  3809         0100
  380A         0000
  380B         0804
  380C         8300
  380D         0000
  380E         0000
  380F         0000
> 3810         8001
  3811         4080
  3812         0004
  3813         0000
  3814         0000
  3815         0000
  3816         1001
  3817         0000
  3818         0100
  3819         0004
  381A         0000

** Hit Carriage Return To Return To Basic Command Menu **

Using the arrow keys, adjacent data memory locations can also be viewed.
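The range of addresses in the display follows from the "ten above and ten below" rule. The Python sketch below computes that range; display_window is a hypothetical name, and the clamping at the ends of the 14-bit address space is an assumption, since the behaviour of the firmware near address 0000 or 3FFF is not described here.

```python
DM_TOP = 0x3FFF   # highest 14-bit data memory address

def display_window(center, radius=10):
    """Addresses shown around `center`: ten below through ten above."""
    first = max(0, center - radius)        # clamping is an assumption
    last = min(DM_TOP, center + radius)
    return list(range(first, last + 1))

window = display_window(0x3810)
print(f"{window[0]:04X} .. {window[-1]:04X}, {len(window)} locations")
# 3806 .. 381A, 21 locations
```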

Write Data Memory 

We can write a value to one or more data memory addresses by selecting the Write Data Memory
entry in the Basic Command Menu. For example, to write hex 1234 in 16 data memory locations
beginning at hex 3800, enter the values as shown below.

"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu

Read/Write Registers
Read Data Memory
Write Data Memory
Read Program Memory
Write Program Memory
Reset Processor
Single Step

Set, Clear, List Breakpoints
Run Until Breakpoint
Boot From External EPROM
Download From Host
Upload To Host
Display Stacks



ENTER HEX MEMORY ADDRESS: 3800
ENTER HEX DATA VALUE : 1234
ENTER HEX NUMBER OF LOCATIONS: 10



The data memory locations will be displayed as follows:

  HEX Addr.    DM Value (HEX)
  37F6         FFFF
  37F7         FFFF
  37F8         FFFF
  37F9         FFFF
  37FA         FFFF
  37FB         FFFF
  37FC         FFFF
  37FD         FFFF
  37FE         FFFF
  37FF         FFFF
> 3800         1234
  3801         1234
  3802         1234
  3803         1234
  3804         1234
  3805         1234
  3806         1234
  3807         1234
  3808         1234
  3809         1234
  380A         1234

** Hit Carriage Return To Return To Basic Command Menu **
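Note that every entry at the prompts is hexadecimal, including the count: typing 10 fills sixteen locations, 3800 through 380F. The following Python sketch imitates the fill; write_block and the dictionary standing in for data memory are hypothetical, and the code is not the EZ-ICE firmware.

```python
def write_block(memory, address_hex, value_hex, count_hex):
    """Fill `count` consecutive locations; all three entries are hex strings."""
    address = int(address_hex, 16)
    value = int(value_hex, 16)
    count = int(count_hex, 16)             # "10" typed at the prompt means 16
    for offset in range(count):
        memory[address + offset] = value
    return count

dm = {}                                    # sparse stand-in for data memory
written = write_block(dm, "3800", "1234", "10")
print(written, f"{min(dm):04X}..{max(dm):04X}")   # 16 3800..380F
```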

Read Program Memory 

The Read Program Memory command on the Basic Command Menu allows us to view the
24-bit contents of the program memory location selected, plus ten locations above and below
the address. To read the program memory location at hex 20, select the Read Program Memory
entry as shown below.

"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu



Read/Write Registers 
Read Data Memory 
Write Data Memory 
Read Program Memory 
Write Program Memory 
Reset Processor 
Single Step 

Set, Clear, List Breakpoints 
Run Until Breakpoint 
Boot From External EPROM 
Download From Host 
Upload To Host 
Display Stacks 



ENTER HEX MEMORY ADDRESS: 20 

Then the program memory address, plus ten locations above and below it, the Program
Counter (PC) and any pre-defined breakpoints (BK) will be displayed as follows.

HEX Addr.       PM Value (HEX)    Disassembled Contents
     0016       0A001F            RTI
     0017       0A001F            RTI
     0018       0A001F            RTI
     0019       0A001F            RTI
     001A       0A001F            RTI
     001B       0A001F            RTI
     001C       1C097F            CALL 0x0097
<PC> 001D       378000            I0=0x3800
     001E       340014            M0=0x0001
     001F       341008            L0=0x0100
   > 0020       379052            I2=0x3905
     0021       340006            M2=0x0000
     0022       34000A            L2=0x0000
     0023       381000            I4=0x0100
     0024       380014            M4=0x0001
     0025       381008            L4=0x0100
     0026       382001            I5=0x0200
     0027       381009            L5=0x0100
     0028       383002            I6=0x0300
     0029       38100A            L6=0x0100
     002A <BK>  384003            I7=0x0400

** Hit Carriage Return To Return To Basic Command Menu **

Write Program Memory 

The Write Program Memory entry allows us to modify the contents of any existing RAM location
in program memory with a six-digit hexadecimal value. To change the instruction at a program
memory location, select the Write Program Memory entry in the Basic Command Menu. The
EZ-ICE firmware will request the hexadecimal location of the program memory we want
to modify. Then it asks for a six-digit instruction encoding. Finally, we will be
prompted for the number of program memory addresses we want filled with this value.

"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu

Read/Write Registers
Read Data Memory
Write Data Memory
Read Program Memory
Write Program Memory
Reset Processor
Single Step

Set, Clear, List Breakpoints
Run Until Breakpoint
Boot From External EPROM
Download From Host
Upload To Host
Display Stacks



ENTER HEX MEMORY ADDRESS : A
ENTER HEX DATA VALUE: 1800C0
ENTER HEX NUMBER OF LOCATIONS : 4



As soon as you press [Enter] on the last entry of the Write Program Memory, the modified
program memory will be displayed as follows.

HEX Addr.       PM Value (HEX)    Disassembled Contents
<PC> 0000       1801CF            JUMP 0x001C
     0001       0A001F            RTI
     0002       0A001F            RTI
     0003       0A001F            RTI
     0004       1807AF            JUMP 0x007A
     0005       0A001F            RTI
     0006       0A001F            RTI
     0007       0A001F            RTI
     0008       0A001F            RTI
     0009       0A001F            RTI
   > 000A       1800C0            IF EQ JUMP 0x000C
     000B       1800C0            IF EQ JUMP 0x000C
     000C       1800C0            IF EQ JUMP 0x000C
     000D       1800C0            IF EQ JUMP 0x000C
     000E       0A001F            RTI
     000F       0A001F            RTI
     0010       0A001F            RTI
     0011       0A001F            RTI
     0012       0A001F            RTI
     0013       0A001F            RTI
     0014       0A001F            RTI

** Hit Carriage Return To Return To Basic Command Menu **
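The JUMP and CALL encodings that appear in the two program memory displays share a visible pattern: a fixed upper field, one bit that distinguishes CALL from JUMP, a 14-bit target address, and a 4-bit condition. The Python sketch below decodes words on that assumption. The field layout is inferred from the displayed examples rather than quoted from the instruction set reference, decode_branch is a hypothetical name, and only the two condition codes that actually occur in the displays are named.

```python
CONDITIONS = {0x0: "IF EQ ", 0xF: ""}     # only the two codes seen in the displays

def decode_branch(word):
    """Decode a 24-bit JUMP/CALL word; returns None for anything else."""
    if (word >> 19) != 0b00011:            # assumed fixed upper field
        return None
    mnemonic = "CALL" if (word >> 18) & 1 else "JUMP"
    address = (word >> 4) & 0x3FFF         # 14-bit target address
    condition = CONDITIONS.get(word & 0xF, "IF cond? ")
    return f"{condition}{mnemonic} 0x{address:04X}"

for word in (0x1801CF, 0x1807AF, 0x1800C0, 0x1C097F, 0x0A001F):
    print(f"{word:06X}", decode_branch(word))
```

The sketch reproduces the disassembled column for the four branch words and returns None for the RTI word 0A001F, which is outside its scope.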



Reset Processor 

The Emulator can be re-initialized upon selection of this entry. Upon completion of the reset,
the "Processor Reset Complete!" message is displayed briefly.

"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu

Read/Write Registers 
Read Data Memory 
Write Data Memory 
Read Program Memory 
Write Program Memory 
Reset Processor 
Single Step 

Set, Clear, List Breakpoints 
Run Until Breakpoint 
Boot From External EPROM 
Download From Host 
Upload To Host 
Display Stacks 



Processor Reset Complete ! 

Single Step 

We can perform a single step operation while viewing the registers in the Read/Write Registers 
display. We can also do this from the Basic Command Menu by selecting the Single Step entry: 



"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu 

Read/Write Registers 
Read Data Memory 
Write Data Memory 
Read Program Memory 
Write Program Memory 
Reset Processor 
Single Step 

Set, Clear, List Breakpoints 
Run Until Breakpoint 
Boot From External EPROM 
Download From Host 
Upload To Host 
Display Stacks 



Addr: 0000 Instr: JUMP 0x001C

Set, Clear, List Breakpoints 

EZ-ICE is capable of supporting as many as 16 addressed breakpoints in program memory. 
Breakpoints can be set, cleared and/or listed from the Breakpoint Display which is reached from 
the Basic Command Menu by selecting the Set, Clear, List Breakpoints entry. 

The presence of a breakpoint is shown as "<BK>" in the Read Program Memory display.
A snapshot of the Breakpoint Display is shown below.

Breakpoint Number    Breakpoint Address (HEX)
0                    00FC
1                    0180
2                    0232
3
4                                                Enter Breakpoint Address
5                                                At Cursor Position
6
7
8
9
A
B
C
D
E
F

** To Delete A Breakpoint, Type X Followed By The Breakpoint Number **
** Hit Carriage Return To Return To Basic Command Menu **

To add a breakpoint, type in a four-digit hexadecimal address followed by [Enter]. To
remove a breakpoint, type in "x#" where "#" is the number of the breakpoint you want to remove.
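The breakpoint display behaves like a small table of sixteen numbered slots. The Python sketch below models that bookkeeping only; the class and method names are hypothetical, and placing a new address in the first free slot is an assumption, since on the real display the address is entered at the cursor position.

```python
class BreakpointTable:
    """Sixteen numbered slots (0-F), each empty or holding a PM address."""

    def __init__(self, size=16):
        self.slots = [None] * size

    def add(self, address_hex):
        """Store a four-digit hex address in the first free slot (an assumption)."""
        slot = self.slots.index(None)      # raises ValueError when the table is full
        self.slots[slot] = int(address_hex, 16)
        return slot

    def delete(self, command):
        """Handle a command such as 'x1': clear breakpoint number 1."""
        slot = int(command[1:], 16)
        self.slots[slot] = None

    def hit(self, pc):
        """True when the program counter matches a stored breakpoint."""
        return pc in self.slots

table = BreakpointTable()
for address in ("00FC", "0180", "0232"):
    table.add(address)
table.delete("x1")
print([f"{a:04X}" if a is not None else "----" for a in table.slots[:4]])
```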



Run Until Breakpoint 

The Run Until Breakpoint entry in the Basic Command Menu causes the Emulator to execute
instructions from the current program counter address until the next breakpoint is encountered.
However, the execution of instructions can be stopped by pressing any key on the keyboard. A
snapshot of the Processor Running Screen is shown below.

"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu

Read/Write Registers
Read Data Memory
Write Data Memory
Read Program Memory
Write Program Memory
Reset Processor
Single Step

Set, Clear, List Breakpoints
Run Until Breakpoint
Boot From External EPROM
Download From Host
Upload To Host
Display Stacks



PROCESSOR RUNNING!

Boot From External EPROM

EZ-ICE has the capability of transferring any of up to 8 pages of external boot EPROM in the
target system into the 2K, 24-bit words of program memory RAM within the Emulator. Booting
is available only if the MMAP pin is pulled low (i.e. MMAP = 0). An error message will
result upon selection of this entry if MMAP = 1. A snapshot of the Booting Screen is shown
below.



"E Z - I C E" ADSP-2101 EMULATOR 
Basic Command Menu 

Read/Write Registers 
Read Data Memory 
Write Data Memory 
Read Program Memory 
Write Program Memory 
Reset Processor 
Single Step 

Set, Clear, List Breakpoints 
Run Until Breakpoint 
Boot From External EPROM 
Download From Host 
Upload To Host 
Display Stacks 



Enter Boot Page Number : 5 
** Boot From External EPROM Page 05 Complete ** 
Hit Carriage Return To Continue 

Download From Host 

Selection of this entry enables downloading of Memory Image Files (.EXE) from the PC to 
EZ-ICE. The following is the Download From Host screen. 



"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu 

Read/Write Registers 
Read Data Memory 
Write Data Memory 
Read Program Memory 
Write Program Memory 
Reset Processor 
Single Step 

Set, Clear, List Breakpoints 
Run Until Breakpoint 
Boot From External EPROM 
Download From Host 
Upload To Host 
Display Stacks 



Start Download From Host Now!
Type CNTL C To Abort

If MMAP = 0, any boot kernel within locations 0x0000 and 0x07FF will be loaded into the
internal program memory of the ADSP-2101. If MMAP = 1 or a boot memory address is located
on any other page, the download will take place to program memory with the same addresses
defined in the boot kernel.

At this point the communications program is configuring the PC as a terminal with the
other device being the host; any transfer from the PC is defined as upload. If you are using
Procomm as the terminal emulation software, press the [PgUp] key after selecting this entry. Then,
select ASCII as the transfer protocol. Finally, type in the full file name including the .EXE
extension. Downloading from the PC will begin and a message, "Download In Progress!", will
be displayed on the screen.

Upload To Host 

The contents of both program memory and data memory can be uploaded to the PC a segment
at a time. Two file formats can be generated: the Memory Image File (.EXE) format and the Normal
format. The Memory Image File format is the same as the format generated by the
Linker. In other words, the uploaded file can be re-loaded to the emulator later on. The Normal
format is a regular ASCII format, so the contents of the file can be viewed like a regular text
file, but a file in this format cannot be re-loaded to the emulator. A sample Upload To Host
screen is shown below.

"E Z - I C E" ADSP-2101 EMULATOR
Basic Command Menu

Read/Write Registers
Read Data Memory
Write Data Memory
Read Program Memory
Write Program Memory
Reset Processor
Single Step

Set, Clear, List Breakpoints
Run Until Breakpoint
Boot From External EPROM
Download From Host
Upload To Host
Display Stacks



Type 1 For PM, 2 For DM : 1

Type 1 For .EXE Format, 2 For Normal Format : 1
Enter Hex Starting Address :
Enter Hex Number Of Locations To Upload : 800
Hit Carriage Return To Continue

If Procomm is used as the terminal emulation software, press the [PgDn] key after entering the
number of locations to upload and before pressing [Enter] again. Then, Procomm provides a
menu asking for the transfer protocol to use; your response must be ASCII. Procomm next
requests the name and extension of the file to download. Type in the filename and extension.
File transfer from EZ-ICE to the PC then takes place. When the transfer stops, press the ESC key
to return Procomm to terminal mode. Next press [Enter] to return to the Basic Command Menu.

Display Stacks

The Display Stacks is the last entry on the Basic Command Menu. It allows us to view the 
Emulator's program counter, status, and count stacks as well as the current down counter (CNTR) 
value. A snapshot of the Display Stacks screen is shown below. 



CNTR
0006

PC Stack        Status Stack

Top>  0104
      0526
      1A3C

Count Stack
      0400
      0800


** Hit Carriage Return To Return To Basic Command Menu ** 

Note that since none of these values can be changed from within this display, pressing
[Enter] will return to the top of the Basic Command Menu.

4.5 SUMMARY 

In this chapter, we provided a tutorial approach to learning the ADSP-2101's software and hardware
development tools. After following all the examples and completing the exercises contained in this
chapter, the user should be able to appreciate the inner workings of the microcomputer, the effect
of instructions on various control registers, the use and purpose of the Simulator and the
Emulator, and the integration of the ADSP-2101 in its target environment. This chapter is a
prelude to interesting experiments, projects and applications that will follow in the remaining
chapters.



chapter 5 



LABORATORY EXPERIMENTS 
USING THE ADSP-2101 







5.1 INTRODUCTION

The first four chapters furnished some insight into several hardware and software aspects of the
microcomputer. In this and subsequent chapters we provide laboratory experiments and projects
in digital signal processing using the ADSP-2101 microcomputer. We assume that students
(and readers) are either familiar with or concurrently learning the principles of digital signal
processing. Therefore, we begin in this chapter with introductory experiments and programs.
The remaining chapters deal with more advanced programs and projects.

This chapter is devoted to experiments which incorporate all basic operations done in
DSP. They range in intricacy from the simplest sampling operation to the most elaborate
waveform generators. In each experiment we introduce a new processing step as well as a new
programming concept. This, we believe, will enhance the learning ability of students by
concentrating on a few simple aspects at a time and by taking time to analyze the results more
thoroughly. In this respect we will treat the ADSP-2101 microcomputer as an instrument which
is to be learned and used effectively for understanding DSP principles and algorithms.



There are several experiments in this chapter which are organized into four sections. In
Section 5.2, we begin with A/D and D/A operation and discuss the effect of aliasing. This basic
operation is described using the TALKTHRU program and, since it is the first full program, it
is explained in detail. The elementary DSP operations, i.e., shift, scale, and add, are introduced
in Section 5.3. These operations are used to simulate effects of acoustic delay and echo. Section
5.4 explains the implementation of difference equations which are necessary in any filtering
operation. Experiments on convolution and recursive filtering are based on this implementation.
Finally, Section 5.5 describes how to implement transcendental functions and random number
generators on the ADSP-2101. These generators along with the Timer unit of the microcomputer
are used in experiments to generate and display waveforms. All these concepts are described
using simple programs, while in experiments students are asked to write more elaborate
programs. The program listings (.DSP files) given in this chapter and their executables (.EXE files)
are on the diskette which is available from Analog Devices, Inc. (see Preface for details).

Before beginning the actual work, the user should check to make sure that the EZ-ICE
and the EZ-LAB are successfully set up, tested and connected to a PC. For audio input/output,
a microphone and a speaker should be connected to the EZ-LAB according to its manual [4].
Similarly for DAC outputs of the EZ-LAB, an oscilloscope should be available. Finally, the
user should verify that the cross-software is properly installed on the PC and that
communication software to emulate a VT100 is available.

The programs described in this chapter use one of the two architecture description files,
referred to as EZLAB1.ACH and EZLAB2.ACH. These files are also available on the diskette.
In most programs, the internal program and data memory available in the microcomputer is
sufficient. Hence the default target system is described in the EZLAB1.ACH file. However, in the
DELAY and ECHO programs described in Section 5.3, external memory located on the EZ-ICE
is used to generate delays up to one second. This configuration is described in the EZLAB2.ACH
file. It is not necessary to use these files and they are given for reference purposes only. In fact,
wherever possible, students should write their own architecture description files to incorporate
their own defined ports, symbols, etc.

5.2 A/D AND D/A CONVERSION 

The first step in digital signal processing is sampling and quantization of analog signals or A/D 
conversion. The final step in producing the processed analog signal requires reconstruction or 
D/A conversion. These are perhaps the simplest operations to perform. Therefore in the first 
experiment, we will consider these operations and their effects on analog signals. 

The purpose of this experiment is two-fold. First, in almost all applications, we will need
code to input and output signal data to and from the processor. Thus the program code from
this section can be used several times in other experiments, projects, or programs. Second, a
simple A/D and D/A program, called the Talk Through Program, can be used to test our hardware
setup at the beginning of each laboratory session. Although it is not a substitute for a hardware
diagnostic program, it nevertheless does provide some assurance when we are debugging more
complex programs.

5.2.1 A Talk Through Program

In this program, an audio signal from a microphone is passed to a speaker through the ADSP-2101
system which acts as a link (albeit expensive). It is available on the diskette as a
TALKTHRU.DSP file. We will explain the workings of this program and also provide details about its
execution on the Emulator. As Experiment 1, we will assemble, link and execute variations of
this program.

The program TALKTHRU.DSP is shown in Listing 5- 1 . It contains assembler directives, 
helpful comments and instructions. Using comments and instructions (as discussed in Chapters 
2 and 4) it should be possible to understand this program. However, we will discuss it in detail 
below. Subsequent programs will not be described in as much detail except for any new features. 
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{ ADSP-2101 Talk Through Program 



TALKTHRU . DSP 



This program takes an input sample from serial port receive register 
and outputs it to serial port transmit register. It is intended for 
speech signals. The serial clock is internally generated and 12.288 MHz
processor rate gives 8 KHz sampling rate. This program can be used for
testing the hardware.

This program is written for EZ-ICE and EZ-LAB system with EZLAB1.ACH
architecture file. Assemble using ASM21.EXE and link using LD21.EXE to
produce TALKTHRU.EXE. Load TALKTHRU.EXE in EZ-ICE and execute.
}

.MODULE/RAM/BOOT=0/ABS=0 TALKTHRU;      {Beginning of TALKTHRU Program}

{ Interrupt Vectors }

        JUMP start; NOP; NOP; NOP;      {Start Interrupt}
        RTI; NOP; NOP; NOP;             {External Pin Interrupt IRQ2}
        RTI; NOP; NOP; NOP;             {SPORT0 Transmit Interrupt}
        JUMP sample; NOP; NOP; NOP;     {SPORT0 Receive Interrupt}
        RTI; NOP; NOP; NOP;             {SPORT1 Transmit Interrupt}
        RTI; NOP; NOP; NOP;             {SPORT1 Receive Interrupt}
        RTI; NOP; NOP; NOP;             {TIMER Interrupt}

{ Initializations }

start:  AX0=0x1000;                     {SPORT0 enabled, PM Wait State 0,}
        DM(0x3FFF)=AX0;                 {BOOT Wait state 0, BOOT page 0}
        TOGGLE FLAG_OUT;                {Lights FLAG LED}
        AX0=0x0000;
        DM(0x3FFE)=AX0;                 {All DM Wait States 0}
        DM(0x3FFB)=AX0;                 {TIMER}
        DM(0x3FFC)=AX0;                 { not used,}
        DM(0x3FFD)=AX0;                 { cleared}
        DM(0x3FF9)=AX0;                 {Receive}
        DM(0x3FFA)=AX0;                 { Multichannels}
        DM(0x3FF7)=AX0;                 {Transmit}
        DM(0x3FF8)=AX0;                 { Multichannels}
        AX0=0x6B27;                     {Multichannel disabled}
        DM(0x3FF6)=AX0;                 {Int. gen serial clock}
                                        {Receive frame sync reqd, width 0}
                                        {Transmit frame sync reqd, width 0}
                                        {Int trans, receive frame sync ena}
                                        {u-law companding, 8 bit word len}
        AX0=0x0002;
        DM(0x3FF5)=AX0;                 {Generate 2.048 MHz serial clock}
        AX0=255;
        DM(0x3FF4)=AX0;                 {Divide by 256 for 8KHz samp rate}
        AX0=0x0000;
        DM(0x3FF3)=AX0;                 {SPORT0 AUTOBUFF disabled}
        DM(0x3FF2)=AX0;                 {SPORT1 CNTL disabled}
        DM(0x3FF1)=AX0;                 {SPORT1 timing not used}
        DM(0x3FF0)=AX0;                 {SPORT1 timing not used}
        DM(0x3FEF)=AX0;                 {SPORT1 AUTOBUFF DISABLED}
        ICNTL=0x07;                     {Enable edge sensitive interrupt}
        IMASK=0x08;                     {Enable SPORT0 Interrupt}

{ Wait for sample }

wait:   IDLE;                           {Wait until next sample appears at}
        JUMP wait;                      { SPORT0}

{ Process sample }

sample: AX0=RX0;                        {Put received sample in AX0}
        TX0=AX0;                        {Transmit sample value in AX0}
        RTI;                            {Return from Interrupt}

.ENDMOD;                                {End of TALKTHRU Program}

Listing 5-1: Talk Through Program

The program begins with a comment field containing a brief summary and information
for its execution. The first non-comment statement is the .MODULE directive signaling the
beginning of the program at absolute address 0 and boot page 0. The next seven lines contain
interrupt service vectors, each four instructions long. The first of these is the start interrupt,
which transfers program control to the start: label. Since SPORT0 provides our I/O, only the
fourth vector provides the necessary servicing (which in this case is directed towards the
sample: label). The remaining interrupt vectors provide no servicing and simply return with
RTI. This block of vectors is a required feature and must be used in all programs. Depending
on each program's I/O, the way each interrupt is serviced will change accordingly.
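
Because every vector occupies four instruction slots, the address of each vector follows directly from its position in the table. The short Python sketch below is only an illustration of that arithmetic, not part of the ADSP-2101 tools; the list order is taken from the comments of Listing 5-1 and the function name is hypothetical.

```python
# Vector order as given in the comments of Listing 5-1.
VECTORS = [
    "Start (reset)",
    "IRQ2",
    "SPORT0 transmit",
    "SPORT0 receive",
    "SPORT1 transmit",
    "SPORT1 receive",
    "TIMER",
]

INSTRUCTIONS_PER_VECTOR = 4  # each vector slot holds four instructions


def vector_address(name):
    """Program-memory address of the first instruction of a vector slot."""
    return VECTORS.index(name) * INSTRUCTIONS_PER_VECTOR


for name in VECTORS:
    print(f"0x{vector_address(name):04X}  {name}")
```

The fourth entry, the SPORT0 receive vector, lands at 0x000C, which is where JUMP sample is placed in the listing.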

After the execution begins, the program jumps to start:. Here, for the next thirty lines, the
data memory mapped control registers are set up. In this first program, we explicitly initialize
all control registers. According to the ADSP-2101 User's Manual [1], the registers have default
values at startup and only a few registers are required to be set up. However, it is good
programming practice to initialize all control registers and avoid any possible problems later on.
For a complete explanation of the initialization of these registers, refer to Appendix D of [1].

The first of these is the System Control Register at DM location 0x3FFF. It is initialized
so that SPORT0 is enabled (bit 12), FI/FO pins are enabled and PM wait state, BOOT wait state
and BOOT page are set to 0. The next instruction toggles the FLAG_OUT pin so that the
corresponding LED lights up on the EZ-LAB board. Since the Emulator lacks trace capability,
this instruction can be used to debug programs by suitably placing it in a suspect part of the
code. By looking at the status of this LED, we can determine which part of the program is being
executed. Otherwise, it does not contribute to the working of the program. Control registers
from DM locations 0x3FF7 to 0x3FFE are all reset to 0. Therefore, all DM wait states are set
to 0, and TIMER and multichannel operations are disabled.

Since SPORT0 is enabled, its Control Register located at 0x3FF6 is an important one. In
this register bits 0-2, 5, 8, 9, 11, 13 and 14 are set to 1. This selects μ-law companding, 8-bit
word length, and transmit/receive frame sync. Also the serial clock is internally generated.
Note that bits 0-3 denote the serial word length parameter SLEN and the word length is equal to
SLEN + 1. Therefore, SLEN is set to 7. Most of the values in this register will be the same for
many programs in this laboratory.
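
The bit positions quoted above can be checked against the value 0x6B27 used in Listing 5-1. The following Python sketch is a simple desk check, not processor code; the helper names are ours.

```python
def set_bits(value, width=16):
    """Return the positions of all bits that are 1 in a register value."""
    return [bit for bit in range(width) if (value >> bit) & 1]


def word_length(control_value):
    """Serial word length: SLEN occupies bits 0-3, and length = SLEN + 1."""
    slen = control_value & 0xF
    return slen + 1


SPORT0_CONTROL = 0x6B27  # value written to DM(0x3FF6) in Listing 5-1

print(set_bits(SPORT0_CONTROL))
print(word_length(SPORT0_CONTROL))
```

The first line printed lists bits 0, 1, 2, 5, 8, 9, 11, 13 and 14, and the second gives a word length of 8, in agreement with the description above.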

The next two registers are important in setting the sampling rate for our operation. The
SPORT0 Serial Clock Divide Modulus (SCLKDIV) register at 0x3FF5 generates the serial
clock frequency (SCLK) according to the following formula:

$$ \mathrm{SCLK} = \frac{\mathrm{CLKOUT}}{2\,(\mathrm{SCLKDIV} + 1)} $$

where CLKOUT is the processor frequency. In our lab setup, it is assumed that the ADSP-2101
is operating at 12.288 MHz. Therefore, SCLKDIV = 2 provides a serial clock frequency of
2.048 MHz. The sampling frequency is derived from SCLK using the value in the RFSDIV
register located at 0x3FF4. It is given by

$$ \text{Sampling Frequency} = \frac{\mathrm{SCLK}}{\mathrm{RFSDIV} + 1} $$

Hence a value of 255 in RFSDIV gives a sampling frequency of 8 KHz, suitable for speech
signals up to 4 KHz in bandwidth. From the above two formulas, it can be seen that the sampling
frequency depends on two parameters, SCLKDIV and RFSDIV. This also means that a wide
range of sampling frequencies can be generated and that more than one combination of
SCLKDIV and RFSDIV can produce the same sampling frequency.
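
The two formulas are easy to evaluate by hand, but a small calculator makes it convenient to try other divider values. The Python sketch below assumes the 12.288 MHz processor clock of our lab setup; the function names are illustrative only, and the second pair of divider values is included purely to show the arithmetic, not as a setting verified on the EZ-LAB.

```python
CLKOUT_HZ = 12_288_000  # processor clock assumed in this laboratory


def sclk_hz(sclkdiv, clkout_hz=CLKOUT_HZ):
    """Serial clock: SCLK = CLKOUT / (2 * (SCLKDIV + 1))."""
    return clkout_hz / (2 * (sclkdiv + 1))


def sampling_hz(sclkdiv, rfsdiv, clkout_hz=CLKOUT_HZ):
    """Sampling frequency: SCLK / (RFSDIV + 1)."""
    return sclk_hz(sclkdiv, clkout_hz) / (rfsdiv + 1)


print(sclk_hz(2))            # serial clock used in Listing 5-1
print(sampling_hz(2, 255))   # dividers used in Listing 5-1
print(sampling_hz(5, 127))   # a different pair giving the same rate
```

Both divider pairs evaluate to 8000 Hz, which illustrates that a given sampling frequency does not correspond to a unique register setting.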

The next five memory mapped control registers are reset to zero since their functions are
not required in this program. The control registers ICNTL and IMASK, which are not memory
mapped, are also important in every program. The ICNTL register, which is 5 bits wide and
which controls the interrupt sensitivity, is set to 7 so that all interrupts are edge sensitive. Finally
the IMASK register, which is 6 bits wide, is set to 8 so that the SPORT0 receive interrupt is enabled.
It is good programming practice to set this register after all other parameters of the serial
ports are initialized and before interrupts are allowed, to prevent the processor from vectoring
to an interrupt routine before the set-up is complete. Note that the ICNTL register should be
set before the IMASK because as soon as the IMASK is set, interrupts are accepted and the
processor could vector before the ICNTL instruction is executed.
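
To see how an IMASK value is assembled from the interrupts we want to enable, consider the Python sketch below. The bit assignment in the table is our assumption (IRQ2 in bit 5 down to TIMER in bit 0); it is consistent with the value 0x08 used here for the SPORT0 receive interrupt and with the value 0x20 used for IRQ2 in the Delay program later in this chapter, but the User's Manual [1] remains the authority.

```python
# Assumed IMASK bit positions (check against the User's Manual).
IMASK_BITS = {
    "IRQ2": 5,
    "SPORT0 transmit": 4,
    "SPORT0 receive": 3,
    "SPORT1 transmit": 2,
    "SPORT1 receive": 1,
    "TIMER": 0,
}


def imask(*sources):
    """Build an IMASK value that enables the named interrupt sources."""
    value = 0
    for source in sources:
        value |= 1 << IMASK_BITS[source]
    return value


print(hex(imask("SPORT0 receive")))
print(hex(imask("IRQ2")))
```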

So far in this program, we have spent almost thirty lines to set up the processor. This seems a lot
for this simple A/D-D/A conversion program. However, note that we covered all control
registers even though not every one of them is required, so that we will be exposed to them and
be aware of them for any future use. Now the program is ready to process input samples. The
processor loops on the IDLE instruction until the interrupt from SPORT0 is received. The
program is thus interrupt-driven. When the interrupt occurs, program control shifts to the service
routine by first jumping to location 0x000C (JUMP sample) and then jumping to location sample:.

All further activity takes place in the interrupt service routine. The input sample, which
is in the RX0 register of SPORT0, is moved to the AX0 register. Note that any internal data
register can be used. In the next cycle, the input sample is moved from AX0 to TX0, the transmit
register of SPORT0. After the return from interrupt, execution resumes at the wait: loop. The
last statement, .ENDMOD, signals the end of the program.

Now we are ready to execute this program and at the same time check out the EZ-ICE
and EZ-LAB hardware. Copy TALKTHRU.DSP from the diskette to your working directory.
Before beginning this demonstration, make sure that all systems are properly powered up and
running. The EZ-ICE should be configured according to the architecture description file
EZLAB1.ACH with overlay memory disabled.

The first task is to assemble the program. At a DOS prompt, issue the following command.

asm21 talkthru

This file should assemble properly without any errors. The assembler creates three new files
with extensions .OBJ, .CDE, and .INT, which are used by other cross-software programs. Now
create the binary executable file .EXE by linking the object code. Issue the following command
at the DOS prompt.



ld21 talkthru -a ezlab1 -e

Once again the linker should produce TALKTHRU.EXE without any error. This file can now
be loaded into the EZ-ICE. Start the terminal emulation program and activate the Emulator.
This is discussed in Chapter 4. The Emulator will respond with its command menu. Before
loading the program, it is good practice to reset the processor. From the command menu select
the Reset Processor option. Then select the Download From Host option and load the
TALKTHRU.EXE file. Our first program is now ready to control the ADSP-2101.

Run the program by selecting Run Until Breakpoint. A "Processor Running" message
will appear in the command menu window. If the entire system is functioning properly, an
audible click will be heard from the speaker. The microphone is now connected to the speaker
through the processor and any speech spoken into the microphone will be heard from the speaker.
Pressing any key on the keyboard halts the processor, thereby breaking the connection.
Repeat this process several times.

Through this simple talk through program we covered most aspects of programming the
ADSP-2101 processor. Any new features will be dealt with in similar detail as we go along. For
Experiment 1A, we will once again consider the same A/D-D/A conversion but with emphasis
on indirect addressing. In Experiment 1B, we will study the aliasing phenomenon, which is an
effect of sampling.

EXPERIMENT 1A 

In Listing 5-1, all memory mapped control registers were initialized using the direct
addressing mode. In this mode, the DM location was addressed directly using a hexadecimal
address and was assigned a value from one of the data registers (in this case AX0).
This portion of the program is not very readable because the hexadecimal address of
a control register does not indicate the function of that register. However, it is
possible to give symbolic names to these addresses through the .CONST directive of the
Assembler. For example, consider the directive

.CONST sys_cont_reg=0x3FFF;

in which the symbol sys_cont_reg (a symbolic name for the system control register) is
set to the required address of the system control register. Now in the initialization portion
of our program we can use the following instruction:

DM(sys_cont_reg)=AX0;

instead of

DM(0x3FFF)=AX0;

This style of programming will make all our programs readable, especially when there is
a need for debugging. Furthermore, it is not necessary to actually insert these .CONST
directives in the main program. We can create a header file, say const.h, containing all
the necessary .CONST directives and then include it in the main program using the
directive

.INCLUDE <const.h>;

This directive keeps the main program short and avoids duplicating .CONST directives
in every program which needs them.

The ADSP-2101, at reset, initializes all necessary control registers so that serial ports,
autobuffering, timer, and interrupts are disabled. Therefore, write and execute a new
program EXPMT-1A.DSP containing the initialization approach using symbolic names
for those control registers which were required in the Talk Through program. Follow the
procedure shown below:

1. Provide symbolic names for the control registers mapped at the locations 0x3FFF,
0x3FFE, 0x3FF6, 0x3FF5, and 0x3FF4.

2. Create a text file, const.h, containing the appropriate .CONST directives for the above
locations.

3. Modify the TALKTHRU.DSP program by inserting a .INCLUDE directive and
rewriting the initialization code using the symbolic names created in the const.h file.

4. Save the new program as EXPMT-1A.DSP.

5. Assemble, link and execute this program in the Emulator.

6. Verify that this talk through program is also working properly.

There is another approach using indirect addressing for initializing memory-mapped
control registers of the ADSP-2101. In this approach, the DM location is accessed via Index
and Modify registers in the DAG but a data value can be directly transferred to any DM location.
This saves one cycle of execution. However, it is not a recommended approach. If DAG registers
are incorrectly initialized or mistakenly overwritten, the control registers will be incorrectly set.
This type of error can be difficult to detect in programs.

5.2.2 Studying the Effect of Aliasing 

From the theory of sampling, we know that if an analog signal is sampled at F_s samples per
second then the resulting discrete signal has frequencies up to F_s/2 Hz. Any frequencies above
F_s/2 Hz are aliased into lower frequencies. We will verify this fact using our ADSP-2101 system.
In order to "see" aliasing, we will have to connect an oscilloscope to the output port. In this
section, we will learn how to use the memory mapped D/A converter (DAC) of the target system
EZ-LAB for this purpose. We will also change the sampling frequency to 4 KHz so that all
frequencies above 2 KHz will be aliased. We will call this program ALIASING.DSP; it is
shown in Listing 5-2.
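
Before running the hardware it helps to predict where a given input tone should appear after sampling. The Python sketch below folds an input frequency into the band from 0 to F_s/2 for an ideal sampler; it ignores any filtering in the codec, and the function name is ours.

```python
def apparent_frequency(f_in_hz, fs_hz):
    """Frequency observed after ideal sampling of a sinusoid at fs_hz."""
    f = f_in_hz % fs_hz          # spectrum repeats every fs
    return min(f, fs_hz - f)     # fold the upper half back below fs/2


FS = 4000  # sampling frequency used in ALIASING.DSP
for f_in in (500, 1500, 2500, 3000, 3500):
    print(f_in, "Hz appears at", apparent_frequency(f_in, FS), "Hz")
```

With F_s = 4 KHz, for instance, a 2.5 KHz tone is expected to show up at 1.5 KHz and a 3.5 KHz tone at 500 Hz.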

The main difference between this and the previous program is the use of the DAC port of the
EZ-LAB. (The architecture description file EZLAB1.ACH contains the port addresses.) The
two .PORT directives declare memory mapped I/O ports in the DM at 0x1000 (WRITE_DAC0)
and 0x2000 (LOAD_DAC, which is used to update DAC outputs). Because of DAC setup timing
requirements, two wait states are required when writing to the DAC. Therefore the DM Wait State
Control Register at 0x3FFE is set so that its DWAIT2 field is equal to 2, which makes the
register value equal to 0x0080. Finally, two extra lines of code are needed in the SPORT0
receive service routine to load and transfer data to DAC0.
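
The register value 0x0080 can be reproduced with a small packing routine. The sketch below assumes that the DM Wait State Control Register holds five 3-bit fields, DWAIT0 to DWAIT4, with DWAITk occupying bits 3k to 3k+2; this layout is our assumption, chosen because it gives 0x0080 for DWAIT2 = 2, and should be confirmed in the User's Manual [1].

```python
FIELD_WIDTH = 3   # assumed: each DWAIT field is three bits wide
NUM_ZONES = 5     # assumed: DWAIT0 .. DWAIT4


def dm_wait_register(wait_states):
    """Pack {zone: wait states} into a DM wait state register value."""
    value = 0
    for zone, states in wait_states.items():
        if not 0 <= zone < NUM_ZONES:
            raise ValueError("zone out of range")
        if not 0 <= states < (1 << FIELD_WIDTH):
            raise ValueError("wait states must fit in the field")
        value |= states << (FIELD_WIDTH * zone)
    return value


print(f"0x{dm_wait_register({2: 2}):04X}")  # DWAIT2 = 2, all others 0
```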

Copy ALIASING.DSP from the diskette to your working directory. Then assemble and
link to create ALIASING.EXE using the procedure discussed in the previous section. Connect

{ ADSP-2101 Program to demonstrate Aliasing                ALIASING.DSP

This program is intended to demonstrate aliasing effects. It takes an
input sample from serial port receive register and outputs it to
serial port transmit register. In addition, the output sample is also
available at DAC0 of EZ-LAB to which an oscilloscope can be connected.
The serial clock is internally generated and the sampling rate is 4
KHz.

This program is written for EZ-ICE and EZ-LAB system with EZLAB1.ACH
architecture file. Assemble using ASM21.EXE and link using LD21.EXE to
produce ALIASING.EXE. Load ALIASING.EXE in EZ-ICE and execute.
}

.MODULE/RAM/ABS=0 ALIASING;             {Beginning of ALIASING Program}
.PORT   WRITE_DAC0;                     {Memory Mapped DAC port at 0x1000}
.PORT   LOAD_DAC;                       {Memory Mapped port at 0x2000}

{ Interrupt Vectors }

        JUMP start; NOP; NOP; NOP;      {Start Interrupt}
        RTI; NOP; NOP; NOP;             {External Pin Interrupt IRQ2}
        RTI; NOP; NOP; NOP;             {SPORT0 Transmit Interrupt}
        JUMP sample; NOP; NOP; NOP;     {SPORT0 Receive Interrupt}
        RTI; NOP; NOP; NOP;             {SPORT1 Transmit Interrupt}
        RTI; NOP; NOP; NOP;             {SPORT1 Receive Interrupt}
        RTI; NOP; NOP; NOP;             {TIMER Interrupt}

{ Initializations: similar to TALKTHRU program }

start:  AX0=0x1000;
        DM(0x3FFF)=AX0;                 {SPORT0 enabled}
        AX0=0x0080;
        DM(0x3FFE)=AX0;                 {DWAIT2 = 2}
        AX0=0x0000;
        DM(0x3FFB)=AX0;                 {TIMER not used, cleared}
        DM(0x3FFC)=AX0;
        DM(0x3FFD)=AX0;
        DM(0x3FF9)=AX0;                 {Receive Multichannels}
        DM(0x3FFA)=AX0;
        DM(0x3FF7)=AX0;                 {Transmit Multichannels}
        DM(0x3FF8)=AX0;
        AX0=0x6B27;                     {Multichannel disabled}
        DM(0x3FF6)=AX0;                 {Int. gen serial clock}
        AX0=0x0002;
        DM(0x3FF5)=AX0;                 {Generate 2.048 MHz serial clock}
        AX0=511;
        DM(0x3FF4)=AX0;                 {Divide by 512 for 4KHz rate}
        AX0=0x0000;
        DM(0x3FF3)=AX0;                 {SPORT0 AUTOBUFF disabled}
        DM(0x3FF2)=AX0;                 {SPORT1 CNTL disabled}
        DM(0x3FF1)=AX0;                 {SPORT1 timing not used}
        DM(0x3FF0)=AX0;                 {SPORT1 timing not used}
        DM(0x3FEF)=AX0;                 {SPORT1 AUTOBUFF DISABLED}
        ICNTL=0x07;                     {Enable edge sensitive interrupt}
        IMASK=0x08;                     {Enable SPORT0 Interrupt}

{ Wait for sample }

wait:   IDLE;                           {Wait until next sample appears at}
        JUMP wait;                      { SPORT0}

{ Process sample }

sample: AX0=RX0;                        {Put received sample in AX0}
        TX0=AX0;                        {Transmit sample value in AX0}
        DM(WRITE_DAC0)=AX0;             {Load DAC0 register}
        DM(LOAD_DAC)=AX0;               {Transfer data}
        RTI;                            {Return from Interrupt}

.ENDMOD;                                {End of ALIASING Program}

Listing 5-2: Program to demonstrate Aliasing

an oscilloscope to the DAC0 terminals of the EZ-LAB. Activate the Emulator, load
ALIASING.EXE and execute the program. The ADSP-2101 processor is now available to verify
aliasing effects. Speak into the microphone and listen to the output signal. Also observe the
signal on the oscilloscope. Can you characterize this output speech signal?

EXPERIMENT 1B

In this experiment, we will connect various signal sources and study the effect of
aliasing on them. We will also learn how to change the sampling frequency in the Emulator
without changing the original program.

1. Edit the EXPMT-1A.DSP program and change the sampling frequency to 4 KHz. Call
this program EXPMT-1B.DSP. Assemble, link and load EXPMT-1B.EXE in the
Emulator.

2. Connect a signal generator to the microphone port (disconnect the microphone). Test
your program for sinusoidal waveforms from 100 Hz to 4 KHz. Record your
observations.

3. Select a square wave input with a fundamental frequency of 1 KHz and record your
observations. Repeat with a fundamental frequency of 1.5 KHz. Can you justify these
observations?

4. We want to change the sampling frequency from 4 KHz to 6 KHz. One way to do
this is to edit EXPMT-1B.DSP, make the necessary change and then assemble, link
and execute the program. There is another approach. We can make a direct change
in the program memory from the Emulator's command menu without going through
the edit-assemble-link process. Halt the processor by pressing any key and select the Read
Program Memory option. From the disassembled contents, locate the instruction
AX0=511, which is used to set the sampling frequency to 4 KHz. We want to change this
instruction to AX0=341 (why?). From the command menu, select the Write Program
Memory option. Enter the proper hexadecimal address in program memory at which
the above change is to take place. The Emulator then requests the new content of that
program memory location. Unfortunately, this content must be a six-digit hexadecimal value
which should be the opcode for AX0=341. Refer to the Cross-Software Manual [2] to
determine this opcode. After making this change we have a new A/D-D/A connector
operating at a 6 KHz rate. Verify this sampling frequency using sinusoidal signals.



5.2.3 Exercises 



1. In the Talkthru program, incorporate DAC0 output and display the output speech
signal on an oscilloscope.

2. The audio circuitry of the EZ-LAB contains a codec to which a microphone and
speaker are connected. This codec is designed to operate at an 8 KHz sampling rate.
Therefore our attempt to change the sampling frequency in this section may result in
unpredictable behavior of the codec. (Do your observations so far confirm this?)
Hence we will employ a different technique to reduce the sampling frequency while
still operating the codec at an 8 KHz rate. Edit EXPMT-1B.DSP and change the
sampling frequency back to 8 KHz. Modify the SPORT0 Receive Interrupt service
routine so that every other sample is transmitted to the codec and to the DAC0. This
is called subsampling, or down sampling, and will result in a 4 KHz sampling frequency.
Verify the operation of this new program using speech as well as sinusoidal signals.

3. Write a program so that every other speech sample from the microphone is sign
reversed before it is transmitted to the speaker through a SPORT0 register. This can
be accomplished by using a register which toggles between 1 and -1 along with a
conditional IF statement which checks the sign of this register before sign reversing
the input sample. Mathematically, this operation multiplies the input sequence by the
{1, -1, 1, -1, ...} sequence. It modulates the signal from a base band of 0 Hz to 4 KHz, which
causes frequency inversion due to aliasing. This is known as speech scrambling
through frequency inversion. Characterize the frequency response of the output
sequence in terms of that of the input sequence and justify the name frequency
inversion. Speak into the microphone and record your observations regarding the
scrambled speech.
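
The sign-alternation idea of the last exercise can be tried numerically before it is programmed on the processor. The Python sketch below is only a desk experiment under our own assumptions: an 8 KHz sampling rate, a 64-point block, a 1 KHz test tone that falls exactly on a DFT bin, and a direct (slow) DFT written with the standard library. It locates the strongest spectral line before and after multiplication by the {1, -1, 1, -1, ...} sequence.

```python
import cmath
import math

FS = 8000      # sampling rate in Hz (assumed, as in the Talkthru program)
N = 64         # block length chosen so that 1 KHz falls on a DFT bin
TONE_HZ = 1000


def peak_hz(x, fs=FS):
    """Frequency (0 .. fs/2) of the largest DFT magnitude of block x."""
    n = len(x)
    best_bin, best_mag = 0, -1.0
    for k in range(n // 2 + 1):
        acc = sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * fs / n


tone = [math.cos(2 * math.pi * TONE_HZ * m / FS) for m in range(N)]
# Reverse the sign of every other sample: multiply by 1, -1, 1, -1, ...
scrambled = [s if m % 2 == 0 else -s for m, s in enumerate(tone)]

print(peak_hz(tone))
print(peak_hz(scrambled))
```

Changing TONE_HZ to other bin frequencies (multiples of 125 Hz here) shows how the low and high ends of the band exchange places.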

5.3 ELEMENTARY DSP OPERATIONS 

After getting the ADSP-2101 processor to perform the simplest operation of A/D and D/A 
conversion, we are now ready to program three elementary DSP operations. These are: delay 
(data move), scale (constant multiplier), and add operations. The delay operation provides a 
shift of one to several sample-intervals or even shifts of a few seconds. In the Delay section, 
we will simulate an acoustic delay-line by delaying the input signal up to 1 second (maximum 
possible in EZ-ICE). The constant multiplier scales a sample while the adder adds two signal 
samples to form another sample. In fact almost all DSP algorithms can be done using these 
three elementary operations and the ADSP-2101 processor is optimized for these operations. 
In the Echo section, we will simulate an acoustic echo by adding an input sample to its delayed 
version and suitably scaling the sum to avoid overflow in the processor. In addition to their 
learning value, these programs also provide interesting audio effects. 

The single most important programming concept in this experiment is the circular buffer.
The incoming data is stored in the buffer whose length is equal to the total amount of delay (in
number of samples) we want to generate. The stored value in a buffer location is first transmitted
to the serial port and then the incoming sample value is stored at that buffer location. The buffer
location is advanced by one and the process continues. Due to the circular nature of the buffer,
the stored sample is read out after one rotation (equal to the length) of the buffer, thus providing
the necessary delay. The addressing of the circular buffer is provided by the index, modify and
length registers.

In the last experiment, we executed programs on the Emulator. As a learning experience, 
we will execute the Delay and Echo programs in this experiment from the Simulator. Obviously 
the microphone and the speaker cannot be interfaced with the Simulator. However their actions 
can still be simulated through disk files. This requires using the Simulator commands to open 
data files as serial ports and also to simulate interrupts. Once we are satisfied by verification 
on the Simulator that these programs are working properly, then we will execute them on the 
Emulator and demonstrate their audio effects. This is the framework of our Experiment-2.

5.3.1 A Delay Program

In this program, we will demonstrate the delay operation using the Simulator. If x(n) is the
input sample and y(n) is the output sample then the operation performed by this program is

y(n) = x(n-N)

where N is the amount of delay in samples. Since the program is executed in the Simulator,
there are some particular features which we will explain in detail. The program DELAY.DSP
is shown in Listing 5-3.



{ ADSP-2101 Delay Program                                  DELAY.DSP

This program takes an input sample from memory-mapped port_in at
location 0x1000 and outputs it to another memory-mapped port_out at
location 0x1001 after a delay of 10 samples. It is intended for the
Simulator. The interrupt used is IRQ2, which must be simulated in the
Simulator. An appropriate architecture file must also be written to
include memory mapped I/O ports.

Assemble this program using ASM21.EXE and link using LD21.EXE to
produce DELAY.EXE. Load DELAY.EXE in the Simulator, open file INPUT.DAT
at 0x1000 as input and OUTPUT.DAT at 0x1001 as output. Simulate IRQ2
interrupt and then execute.
}

.MODULE/RAM/ABS=0 DELAY_SIM;            {Beginning of DELAY Program}
.PORT   port_in;                        {Memory mapped Input Port}
.PORT   port_out;                       {Memory mapped Output Port}
.CONST  delay_size=10;
.VAR/DM/CIRC cir_buff[delay_size];      {Circular buffer, length delay_size}

{ Interrupt Vectors }

        JUMP start; NOP; NOP; NOP;      {Start Interrupt}
        JUMP delay; NOP; NOP; NOP;      {External Pin Interrupt IRQ2}
        RTI; NOP; NOP; NOP;             {SPORT0 Transmit Interrupt}
        RTI; NOP; NOP; NOP;             {SPORT0 Receive Interrupt}
        RTI; NOP; NOP; NOP;             {SPORT1 Transmit Interrupt}
        RTI; NOP; NOP; NOP;             {SPORT1 Receive Interrupt}
        RTI; NOP; NOP; NOP;             {TIMER Interrupt}

{ Initialization: Circular Buffer }

start:  I0=^cir_buff;                   {Index to top of cir_buf}
        L0=%cir_buff;                   {Enable circular buffer}
        M0=0;                           {No post-increment}
        M1=1;                           {Post-increment by 1}

{ Clear circular buffer }

        CNTR=delay_size;                {Set up counter}
        DO clr_buf UNTIL CE;            {Initialize all cir_buff values}
clr_buf:  DM(I0,M1)=0;                  { to 0}
        ICNTL=0x07;                     {Enable edge sensitive interrupt}
        IMASK=0x20;                     {Enable IRQ2 interrupt}

{ Wait for Sample }

wait:   IDLE;                           {Wait until next sample}
        JUMP wait;

{ Process sample and generate delay }

delay:  AX0=DM(I0,M0);                  {Get delayed sample from cir_buf}
        DM(port_out)=AX0;               {Send delayed sample out}
        AX0=DM(port_in);                {Get input sample in AX0}
        DM(I0,M1)=AX0;
        RTI;                            {Return from Interrupt}

{ End of program }

.ENDMOD;                                {End of DELAY Program}

Listing 5-3: Delay Program

The I/O operation for this program is simulated through disk files since the Simulator
cannot interface with any real physical device. The Simulator is a piece of software and has no
concept of time. Therefore, although it is possible to connect SPORT0 receive and transmit
registers to these files, operation at any sampling rate makes no sense in this case. Use of
memory mapped I/O ports is sufficient to interface with this program (or any program written
for the Simulator). Then one has to write an architecture description file to describe these I/O
ports. Finally, one has to simulate the receipt of input samples as an interrupt. This can be done
through the external interrupt IRQ2. Now all the elements of this program are in place.

The program begins with assembler directives. Memory mapped input is described as
port_in, which is mapped at DM address 0x1000 (this is arbitrary), while the output is described
as port_out, mapped at DM location 0x1001. It is important that these values be specified in
the architecture description file. The .CONST directive assigns the value 10 to the symbol
delay_size, which then can be used in the program. The .VAR directive declares cir_buff as a
circular buffer of size delay_size.

In the interrupt vectors segment, only start and IRQ2 interrupt service routines are required. 
Since SPORTs, Timer, and external memory features are not used in this program, control 
register initializations are not required. 

The circular buffer is addressed via indirect addressing using DAG0. The first instruction
in the initialization segment loads the index register with the address of the first memory location
of the circular buffer cir_buff. This location is actually determined by the linker. Therefore
the "^" operator is used as a pointer to this address. Similarly the "%" operator is used to indicate the
length of the buffer, which is assigned to the length register L0. M0 and M1 are modify registers
which are set to the appropriate values as explained below.

Next, the circular buffer is initialized to zero (or cleared) using the counter CNTR and a
DO ... UNTIL CE loop. The counter is set to the length of the buffer and the DO loop is
executed until this counter expires. This completes the initialization of the DAG registers and
circular buffer. Now interrupt sensitivity is set using the ICNTL register and the IRQ2 interrupt
is enabled with the IMASK register. The program is ready to process input samples. The
processor loops on the IDLE instruction until the interrupt is received from IRQ2. When this
interrupt occurs, the program control shifts to the delay: location.

All further activities take place in the interrupt service routine. First the stored sample
from the circular buffer is transmitted to the output port via the AX0 register. Since M0 is
equal to 0, I0 is not advanced and it still points to the same address location from which the
sample is sent out. Then the sample from the input port is read into this location and I0 is
advanced by one since M1 is equal to 1. This sample will be read out after one rotation of the
circular buffer, thus generating the required delay.
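
The read-then-overwrite behaviour of the service routine can be modelled in a few lines of Python. This is only a software sketch of the idea, not a simulation of the processor: the list plays the role of cir_buff, the integer index plays the role of I0, and the modulo operation stands in for the wrap-around that the length register L0 provides. The class and method names are hypothetical.

```python
class DelayLine:
    """Software model of the circular-buffer delay in Listing 5-3."""

    def __init__(self, delay_size):
        self.buffer = [0] * delay_size  # cleared, like the DO ... UNTIL CE loop
        self.index = 0                  # plays the role of I0

    def process(self, sample):
        delayed = self.buffer[self.index]   # read the old value, no advance (M0 = 0)
        self.buffer[self.index] = sample    # overwrite it with the new input
        # advance by one (M1 = 1) and wrap at the buffer length (L0)
        self.index = (self.index + 1) % len(self.buffer)
        return delayed


line = DelayLine(10)
outputs = [line.process(x) for x in range(1, 26)]
print(outputs)
```

The first ten outputs are the zeros left by the clearing loop; after that the inputs 1, 2, 3, ... reappear, each ten samples late.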

Now we are ready to execute this program. Copy DELAY.DSP from the diskette to your
working directory. Create an appropriate DELAY.SYS file and generate its DELAY.ACH file.
Finally, assemble and link the Delay program to generate the DELAY.EXE file. Do not forget to
generate a .SYM file, which is required in the Simulator.

We are now ready to execute this program in the Simulator. First we need input samples
in a disk file. Create an ASCII INPUT.DAT file containing a few (at least twice the number of
samples being delayed) sample values, with one value per line. Invoke the Simulator and open
program and data memory windows. Execute the following commands from the command
window:

r

l 'delay'

o dm[0x1000] < 'input.dat'

o dm[0x1001] > 'output.dat'

i 2 10

g

The first command resets the processor. The second command loads the delay program. The
next two commands open disk files as memory mapped I/O ports at their respective addresses. The
fifth command simulates interrupt IRQ2. Since timing is not important and since the interrupt
routine takes five cycles, the interrupt is simulated after every ten instruction cycles. Finally,
the last command executes the program. The program halts whenever the INPUT.DAT file
runs out of sample values. Observe the program and data memory windows during the execution.
After the program stops, exit from the Simulator and check the OUTPUT.DAT file to verify
the operation of the Delay program.

5.3.2 Overlay Memory 

The ADSP-2101 microcomputer has 1K 16-bit words of data memory which can be used to
store samples in a circular buffer. This can simulate delays up to 128 milliseconds at an 8 KHz
sampling rate. To generate delays up to 1 sec, we need a data memory of size 8K, which then
must be an external memory. The EZ-ICE probe has 8K 24-bit words of high speed static
memory available as overlay memory. We will use this memory as data memory. It replaces
selected portions of the target system's (EZ-LAB) memory. Use of the overlay memory is
firmware controlled and is initially disabled upon probe reset.
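
The relation between buffer size and delay is simple arithmetic: one buffer word holds one sample, so the delay is the number of words divided by the sampling rate. The Python sketch below, with function names of our own choosing, reproduces the figures quoted above and converts a desired delay into a buffer length.

```python
def max_delay_ms(buffer_words, fs_hz):
    """Longest delay (in ms) a buffer of this many samples can hold."""
    return 1000 * buffer_words / fs_hz


def samples_for_delay(delay_ms, fs_hz):
    """Circular buffer length needed for a given delay."""
    return round(delay_ms * fs_hz / 1000)


FS = 8000  # sampling rate in Hz

print(max_delay_ms(1024, FS))   # 1K words of internal data memory
print(max_delay_ms(8192, FS))   # 8K words of overlay memory
print(samples_for_delay(1000, FS))
```

The 1K internal memory corresponds to 128 ms, while a 1 second delay needs 8000 samples, which fits in the 8K overlay memory.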

The position of jumper JP1 on the probe is used to select the segment of ADSP-2101
memory to be replaced by overlay memory. JP1 is located next to the RESET switch on the
top of the probe. Jumper position 2 selects the overlay memory as data memory. This
configuration still has to be enabled and the actual enabling of the overlay memory is software
selectable from the initial menu, once an interface is established with the PC.

Overlay memory begins at address 0x0000 and continues through address 0x1FFF. Since
EZ-LAB's D/A converters are memory mapped between 0x1000 and 0x1003, they are no longer
available once overlay memory is enabled as data memory. This changes the environment of
the target system and necessitates writing another appropriate system builder file to reflect
external data memory beginning at absolute address 0.

Before beginning this experiment, select position 2 of jumper JP1 to use the overlay 
memory. When you activate the Emulator through the terminal emulation program, do not 
forget to enable the overlay memory from the initial menu. Finally, generate an architecture 
description file EZLAB2.ACH using the System Builder. In case of difficulties, file 
EZLAB2.SYS is available on the diskette.

EXPERIMENT 2A

Using the overlay memory, we will simulate a delay in the samples of up to 1 second 
in the Emulator. Also we will obtain input from the microphone and output samples to 
the speaker of the EZ-LAB. 

1. Modify the Delay program by incorporating code from the Talkthru program to 
simulate delays up to 1 second in the system operating at an 8 KHz sampling rate. 
(Circular buffer length must be 8000). Save this program as EXPMT-2A.DSP. 

2. Simulate delays of 10 ms, 100 ms, 500 ms and 1 sec. Demonstrate your program and 
record your observations. 

5.3.3 An Echo Program 

In a concert hall live performance, we hear sound in different stages. The direct sound is followed 
by multiple reflections or echoes. We will try to simulate this situation using our ADSP-2101 
system. Using one delayed reflection, the output sample y(n) is given by 

$$y(n) = \alpha\, x(n) + \beta\, x(n-N); \qquad \alpha + \beta < 1$$

Using our Delay program it is easy to do this operation. We have to add a scaled delayed sample 
to the scaled current sample with proper care to avoid overflow. Listing 5-4 shows an Echo 
program written for the Simulator. 

{ ADSP-2101 Echo Program                                          ECHO.DSP

  This program takes an input sample from memory-mapped port_in at
  location 0x1000. It outputs to another memory-mapped port_out at location
  0x1001 the average of the current sample and a sample delayed by 10
  samples. It is intended for the Simulator. The interrupt used is IRQ2,
  which must be simulated in the Simulator. An appropriate architecture
  file must also be written to include memory mapped I/O ports.

  Assemble this program using ASM21.EXE and link using LD21.EXE to
  produce ECHO.EXE. Load ECHO.EXE in the Simulator, open file INPUT.DAT at
  0x1000 as input and OUTPUT.DAT at 0x1001 as output. Simulate the IRQ2
  interrupt and then execute.
}

.MODULE/RAM/ABS=0 ECHO_SIM;              {Beginning of ECHO Program}
.PORT port_in;                           {Memory mapped Input Port}
.PORT port_out;                          {Memory mapped Output Port}
.CONST delay_size=10;
.VAR/DM/CIRC cir_buff[delay_size];       {Circular buffer, length delay_size}

{ Interrupt Vectors }

        JUMP start; NOP; NOP; NOP;       {Start Interrupt}
        JUMP delay; NOP; NOP; NOP;       {External Pin Interrupt IRQ2}
        RTI; NOP; NOP; NOP;              {SPORT0 Transmit Interrupt}
        RTI; NOP; NOP; NOP;              {SPORT0 Receive Interrupt}
        RTI; NOP; NOP; NOP;              {SPORT1 Transmit Interrupt}
        RTI; NOP; NOP; NOP;              {SPORT1 Receive Interrupt}
        RTI; NOP; NOP; NOP;              {TIMER Interrupt}

{ Initialization: Circular Buffer }

start:  I0=^cir_buff;                    {Index to top of cir_buff}
        L0=%cir_buff;                    {Enable circular buffer}
        M0=0;                            {No post-increment}
        M1=1;                            {Post-increment by 1}

{ Clear circular buffer }

        CNTR=delay_size;                 {Set up counter}
        DO clr_buf UNTIL CE;             {Initialize all cir_buff values}
clr_buf:   DM(I0,M1)=0;                  { to 0}

        IMASK=0x20;                      {Enable IRQ2 interrupt}

{ Wait for Sample }

wait:   IDLE;                            {Wait until next sample}
        JUMP wait;

{ Process sample and generate delay }

delay:  AY0=DM(I0,M0);                   {Get delayed sample from cir_buff}
        SI=DM(port_in);                  {Get input sample in SI}
        SR=ASHIFT SI BY -1 (HI);         {Scale down current sample by 50%}
        AR=SR1+AY0;                      {Sum current and delayed samples}
        DM(port_out)=AR;                 {Send output sample out}
        DM(I0,M1)=SR1;                   {Move current sample to cir_buff}
        RTI;                             {Return from Interrupt}

{ End of program }

.ENDMOD;                                 {End of ECHO Program}

Listing 5-4: Echo Program 



The only difference between this and the previous program is in the interrupt service 
routine. The input sample is shifted to the right by one bit to obtain 50% scaling and added to 
the delayed sample (which was previously scaled by 50%) to produce the output sample. Execute 
this program in the Simulator using the same INPUT.DAT file as the source for the input samples. 
Check the OUTPUT.DAT file and verify the operation of this program. 
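
Before comparing against OUTPUT.DAT, it can help to have expected values on hand. The following Python sketch is a rough off-line model of the interrupt service routine, not a replacement for the Simulator. The helper names (to_signed16, echo) are illustrative, and input samples are assumed to be signed 16-bit integers.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def echo(samples, delay_size=10):
    """Model of the ISR: y(n) = x(n)/2 + x(n - delay_size)/2."""
    buffer = [0] * delay_size              # circular buffer, cleared at start
    index = 0                              # plays the role of I0
    output = []
    for x in samples:
        delayed = buffer[index]            # AY0 = DM(I0,M0)
        scaled = x >> 1                    # SR = ASHIFT SI BY -1 (HI)
        output.append(to_signed16(scaled + delayed))  # AR = SR1 + AY0
        buffer[index] = scaled             # DM(I0,M1) = SR1
        index = (index + 1) % delay_size   # circular post-increment
    return output


impulse = [0x4000] + [0] * 14
print([hex(v) for v in echo(impulse)])
```

For an impulse of 0x4000 the model produces 0x2000 at sample 0 and again at sample 10, which is the echo delayed by delay_size samples.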

EXPERIMENT 2B 

In this experiment, we will simulate the concert hall effect using the Emulator. Make 
sure that you are using overlay memory and the corresponding .ACH file. 

1. Modify the EXPMT-2A.DSP file to include portions of the Echo program. Call this 
EXPMT-2B.DSP. 

2. Simulate echoes of duration 10 ms and 50 ms. Demonstrate your results. 

3. To obtain the proper concert hall effect, we need at least two reflection delays of small 
duration. Modify EXPMT-2B.DSP to include one additional delay with proper scaling. 
Experimentally demonstrate the concert hall effect using this new program. 

5.3.4 Exercises 

1. Implement the frequency inversion program of Exercise 5.2.3-3 using the operations 
discussed in this section. In particular, use a circular buffer of length 2 containing 
coefficients 1 and -1. 

2. Write an efficient program to implement the following input/output operation 

$$y(n) = x(n) + a\,x(n-N) + a^2\,x(n-2N)$$

5.4 DIFFERENCE EQUATION IMPLEMENTATIONS 

One of the routine operations in DSP is a filtering operation. This operation is typically 
implemented in practice using a difference equation approach. A difference equation 
implementation of a filter requires shifting, scaling, and summing of input/output samples to obtain 
a processed sample. In the last experiment, we studied these basic operations on the ADSP-2101 
microcomputer, which is optimized to perform difference equation operations efficiently. In 
this experiment, we will study and develop more intricate programs on difference equations, 
both in the Simulator and on the EZ-ICE/EZ-LAB system. 
Digital filters are classified as FIR (Finite-duration Impulse Response) or IIR 
(Infinite-duration Impulse Response) filters. FIR filters are characterized as Moving Average (MA) 
filters and their difference equation implementation leads to the familiar linear convolution 
operation. In the Convolution section, we will first study a simulator version of the program 
suitable for convolving two finite duration sequences. In Experiment 3A, we will develop a 
more detailed program to process a real-time (and hence of infinite duration) input signal on 
the Emulator. IIR filters, on the other hand, are characterized as recursive or Auto Regressive 
(AR) filters and are implemented by a general linear constant coefficient difference equation. 
In the Recursive Filter section, we will again study a simulator version of the program which 
computes an impulse response of a first-order recursive filter. In Experiment 3B, we will develop 
a program to implement a general difference equation on the Emulator. Filtering operations on 
speech signals can now be demonstrated using these programs. 

The single most important programming concept in this experiment is the multifunction 
instruction. The ADSP-2101 processor can perform multiply/add, data memory read, and 
program memory read operations in one instruction cycle. The internal architecture is designed 
around this concept for efficient DSP applications. Since the MA and AR parts of the difference 
equation involve multiply/add operations on sequentially stored data, one multifunction 
instruction placed inside a loop is sufficient to implement each part. This is the essence of our 
difference equation implementation. 

5.4.1 A Convolution Program 

In this program, we will implement an FIR filter using the linear convolution operation. If x(n) 
is the input and y(n) is the output of this filter then its difference equation representation is 
given by 

$$y(n) = b_0 x(n) + b_1 x(n-1) + \cdots + b_{M-1} x(n-M+1) = \sum_{m=0}^{M-1} h(m)\, x(n-m)$$

where b_n, 0 ≤ n ≤ M-1, are filter tap weights, h(n) is the impulse response, and 

$$h(n) = \begin{cases} b_n, & 0 \le n \le M-1 \\ 0, & \text{else} \end{cases}$$

If x(n) is an N-point sequence then y(n) will be an (N+M-1)-point sequence after filtering is 
completed. We will demonstrate this using the Simulator. The appropriate program, 
CONVOLV.DSP, is shown in Listing 5-5. 

In this program, there are no input/output operations. Both x(n) and h(n) are loaded in 
the data memory from disk files, while y(n) is available only in data memory. Since this program 
is intended for short duration sequences and since we have studied how to handle input/output 
in the Simulator, these operations are not simulated in this program. 
{ ADSP-2101 Convolution Program                                CONVOLV.DSP

  This program performs a linear convolution of M-point sequence 'h(n)' and
  N-point sequence 'x(n)'. It is intended for the Simulator. Sequence
  'h(n)' is loaded from disk file 'H(N).DAT' while sequence 'x(n)' is loaded
  from file 'X(N).DAT'. The output sequence 'y' is stored in some data
  memory locations. Architecture file EZLAB1.ACH is used for this program.

  Assemble this program using ASM21.EXE and link using LD21.EXE to produce
  CONVOLV.EXE and CONVOLV.SYM. Load these files in the Simulator and execute.
  Check output sequence y(n) using data memory window.
}

.MODULE/RAM/ABS=0 CONVOLV;          {Beginning of CONVOLV Program}
.CONST M = 8;                       {Size of the h(n) sequence: set to 8}
.CONST N = 10;                      {Size of the x(n) sequence: set to 10}
.VAR/DM/CIRC dm_h[M];               {Sequence h(n) in DM as cir_buf}
.VAR/DM/CIRC dm_x[M+N-1];           {Sequence [0,...,0,x(n)] in DM as cir_buf}
.VAR/DM dm_y[M+N-1];                {Output y in DM as linear_buf}
.INIT dm_h: <H(N).DAT>;             {Initialize h and x using data from}
.INIT dm_x: <X(N).DAT>;             { disk files}

        JUMP start; NOP; NOP; NOP;  {Restart Interrupt}
                                    {Disable ALL other interrupts}
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;

start:  I0 = ^dm_h + M - 1;         {Index to Circular Buffer h @ M-1}
        L0 = %dm_h;                 {Length of Circular Buffer h}
        I1 = ^dm_x;                 {Index to Circular Buffer x}
        L1 = %dm_x;                 {Length of Circular Buffer x}
        I2 = ^dm_y;                 {Index to Buffer y}
        L2 = 0;                     {Linear Buffer y}
        M0 = -1;                    {Post-decrement by 1}
        M1 = N;                     {Post-increment by N}
        M2 = 1;                     {Post-increment by 1}

        CNTR = M+N-1;               {Set up counter for LOOP1}
        DO loop1 UNTIL CE;          {Beginning of LOOP1}
           CNTR = M;                {Set up counter for LOOP2}
           MR = 0;                  {Clear MR register}
           DO loop2 UNTIL CE;       {Beginning of LOOP2}
              MX0 = DM(I0, M0);     {Load MX0 register with input}
              MY0 = DM(I1, M2);     {Load MY0 register with input}
loop2:        MR = MR + MX0 * MY0 (SS);   {Do convolution}
           DM(I2, M2) = MR1;        {Store output}
loop1:     MODIFY (I1, M1);         {Next data}

.ENDMOD;                            {End of CONVOLV Program}

Listing 5-5: Convolution Program 

The program as usual begins with assembler directives. The .CONST directives assign 
values to the lengths of the respective sequences. The .VAR directives assign data memory 
buffers to the h(n), x(n), and y(n) sequences. Even though this program performs linear 
convolution, the dm_h and dm_x buffers are declared as circular buffers. This takes care of the 
addressing problem at the start of the convolution loop. 

The new directive we have is .INIT, which initializes the data buffers from values stored 
in disk files. These files must be created by the user and should be located in the directory from 
which the Linker is invoked. The standard format for the data file is a single four- or six-character 
hexadecimal number per line of input (carriage returns are ignored). Refer to the Cross-Software 
Manual [2], appendix B for more details on the file format. 

At this juncture, a discussion on the format of these two buffers, dm_h and dm_x, is in 
order. We are trying to simulate linear convolution using circular buffers. Referring to the 
above convolution equation, if we fold h(m) to obtain h(-m) and shift it with respect to x(m), 
then we can obtain the result by the multiply/add operation over the overlap of x(m) and h(n-m). 
For the first M-1 sample points, h(n-m) will partially overlap with x(m). Hence some of its 
samples will multiply with zeros. This is what we have to simulate. This can be done by padding 
dm_x by M-1 zeros first followed by N values of x(m). Therefore the length of the dm_x buffer 
is (N+M-1). Similarly, for the last (M-1) sample points, h(n-m) again partially overlaps 
with x(m). This, however, is correctly simulated by the circular nature of the dm_x buffer (why?). 
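
The buffer layout can be tried out away from the Simulator. The Python sketch below mimics the addressing only (plain integers, no 1.15 arithmetic) and compares the result with a direct linear convolution. The function names are illustrative assumptions, and the index variables i0 and i1 stand in for the I0 and I1 registers.

```python
def circular_buffer_convolution(h, x):
    """Linear convolution using zero-padded, circularly addressed buffers."""
    M, N = len(h), len(x)
    dm_x = [0] * (M - 1) + list(x)      # M-1 leading zeros, then x(n)
    length = M + N - 1                  # length of dm_x and of the output
    i0 = M - 1                          # index into h, starts at the bottom
    i1 = 0                              # index into dm_x, starts at the top
    y = []
    for _ in range(length):             # outer loop: one output per pass
        acc = 0
        for _ in range(M):              # inner loop: M multiply/adds
            acc += h[i0] * dm_x[i1]
            i0 = (i0 - 1) % M           # circular post-decrement by 1
            i1 = (i1 + 1) % length      # circular post-increment by 1
        y.append(acc)
        i1 = (i1 + N) % length          # extra step of N: net shift of one
    return y


def direct_convolution(h, x):
    """Reference result computed straight from the convolution sum."""
    y = [0] * (len(h) + len(x) - 1)
    for n, xn in enumerate(x):
        for m, hm in enumerate(h):
            y[n + m] += hm * xn
    return y


h = list(range(1, 9))                   # M = 8
x = list(range(1, 11))                  # N = 10
print(circular_buffer_convolution(h, x) == direct_convolution(h, x))
```

Stepping i1 by M inside the inner loop and then by N moves it M+N positions in a buffer of length M+N-1, which is a net advance of one position. Tracing the last few passes by hand shows where the index wraps around and which entries it then reads.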

Since this program has no external I/O, all interrupts except the RESTART interrupt are 
disabled. Following the start: label, initialization for all buffer address registers is done. Since 
we want to simulate h(-m), I0 points to the bottom of the dm_h buffer and its index register 
is decremented by 1 using M0. L0 sets the length of the dm_h circular buffer. The index register 
I1 points to the top of the dm_x buffer and it is incremented by the M2 register during the 
convolution sum (in loop2), while I1 is incremented by N using M1 at the end of this sum (in 
loop1). This simulates the shifting of h(n-m) by one sample (why?). The index register I2 
points to the top of the dm_y buffer and is incremented by 1 using the M2 register. Setting the 
L2 register to 0 makes dm_y a linear buffer. 

The process of convolution now takes place inside two DO Loops. The outer loop, loop1, 
computes N+M-1 samples of the convolution while the inner loop, loop2, computes the convolution 
sum. Since the counter stack is four deep, the same CNTR can be used to regulate these DO 
Loops. The MR register is initialized before the accumulation of the sum. The h(n) sequence 
values are brought in the MX0 register while the x(n) sequence values are brought in the MY0 
register. The two are multiplied and the result is accumulated in the MR register. Note that the 
default mode of operation for the MAC is the fractional 1.15 format arithmetic of 2's complement 
numbers. This is what is assumed in this program. At the end of loop2, the most significant 
fractional bits in MR1 are transferred to the dm_y buffer and the process continues until all samples 
are processed. 
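
To make the 1.15 arithmetic concrete, here is a small Python model of one fractional multiply/accumulate and of reading MR1. It is a sketch under simple assumptions: the accumulator is modelled as an unbounded Python integer (the real MR register is 40 bits wide), and the helper names are illustrative.

```python
def to_q15(value):
    """Convert a float in [-1, 1) to a signed 16-bit 1.15 integer."""
    return max(-32768, min(32767, int(round(value * 32768))))


def mac_fractional(acc, x, y):
    """MR = MR + x * y in fractional mode (product shifted left by one)."""
    return acc + ((x * y) << 1)


def mr1(acc):
    """Bits 16..31 of the accumulator, read as a signed 16-bit value."""
    word = (acc >> 16) & 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


# 0.5 * 0.5 = 0.25
print(hex(mr1(mac_fractional(0, to_q15(0.5), to_q15(0.5)))))

# Two products of about 0.75 each sum to about 1.5, which does not fit
# in 1.15 format: MR1 alone then reads as a negative number.
acc = mac_fractional(0, 0x7FFF, to_q15(0.75))
acc = mac_fractional(acc, 0x7FFF, to_q15(0.75))
print(mr1(acc))
```

The second case shows why the data files should hold fractions small enough that the convolution sum stays below 1 in magnitude when no overflow protection is provided.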

We are now ready to simulate convolution in the Simulator using this program. Copy 
CONVOLV.DSP from the diskette to your working directory. Use the EZLAB1.ACH architecture 
description file to assemble and link this program and generate the CONVOLV.EXE and 
CONVOLV.SYM files for the Simulator. Finally, create two data files: H(N).DAT containing 8 
hexadecimal values, and X(N).DAT containing 17 hexadecimal values (with seven leading 
zeros as discussed above). Do not forget to follow the file format discussed in appendix B of 
the Cross-Software Manual [2]. 




Invoke the Simulator and load the CONVOLV program. We can either single step through 
the Simulator to observe various register operations or execute the entire program. The program 
will stop when it tries to execute an undefined program memory location. At this time check 
the data memory corresponding to the dm_y buffer and verify that it does contain the proper 
convolution results. We have not provided any overflow protection in this program. Therefore 
depending on the values you choose in the data files (which should be fractions) we may get 
incorrect results. 



EXPERIMENT 3A 

In this experiment, we will develop a more detailed version of the above program to 
process speech signals in real-time. 

1. Using the CONVOLV program as a guide, write EXPMT-3A.DSP program to 
incorporate the following features: 

• Input from the microphone and output to the speaker, 

• FIR filter coefficients in the Program Memory, 

• Input samples in the Data Memory, 

• Multifunction instruction inside the convolution loop, 

• Overflow protection (detection and saturation) 

2. Using EXPMT-3A program, implement the following filter equation: 

$$y(n) = \frac{1}{2} \sum_{m=0}^{4} \left(\frac{1}{2}\right)^{m} x(n-m)$$

Determine and plot the magnitude response of the above filter and verify it 
experimentally using a sinusoidal source. 



3. Using the above filter, experiment with speech signals and record your observations. 

5.4.2 Recursive Filter Program 

An AR filter is a recursive (all-pole) filter described by the equation 

$$y(n) = b_0\, x(n) + \sum_{k=1}^{N} a_k\, y(n-k)$$

In this program, using the Simulator, we will compute the impulse response of a first-order 
recursive filter given by 

$$y(n) = x(n) + 0.5\, y(n-1)$$

When x(n) is an impulse sequence δ(n), the output of the filter y(n) is equal to the impulse 
response h(n). This is an infinite duration sequence, therefore the program computes only the 
first few samples. The appropriate program, REC_FILT.DSP, is shown in Listing 5-6. 






{ ADSP-2101 Recursive Filter Program                          REC_FILT.DSP

  This program implements the following first-order recursive filter:
          y(n) = x(n) + 0.5 * y(n-1)
  where y(n) is the output and x(n) is the input. The input sequence is
  an impulse "delta(n)". Hence the output y(n) is the impulse response
  of this filter. The first 10 impulse response values are computed and
  stored in some data memory locations. This program is intended for the
  Simulator. Architecture file EZLAB1.ACH is used for this program.

  Assemble using ASM21.EXE and link using LD21.EXE to produce REC_FILT.EXE
  and REC_FILT.SYM. Load these files in the Simulator and execute. Check
  the impulse response using data memory window.
}

.MODULE/RAM/ABS=0 REC_FILT;          {Beginning of REC_FILT Program}
.CONST samples=10;                   {Number of samples computed (10)}
.VAR/DM/CIRC dm_y[samples];          {Output y(n) buffer}

        JUMP start; NOP; NOP; NOP;   {Restart Interrupt}
        RTI; NOP; NOP; NOP;          {Disregard ALL other interrupts}
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;

start:  I0 = ^dm_y;                  {I0 points to top of y buffer}
        L0 = %dm_y;                  {L0 = Length of buffer y}
        M0 = 1;                      {Post-increment by 1}

        CNTR = samples;              {Set Counter for LOOP1}
        DO loop1 UNTIL CE;
loop1:     DM(I0, M0) = 0;           {Initialize y buffer to 0}

        MY0 = 0x4000;                {Coefficient .5 in MY0}
        MR1 = 0x7FFF;                {Set y(0) = 1 ~= 0.111...1}

        CNTR = samples;              {Set up counter for LOOP2}
        DO loop2 UNTIL CE;           {Calculate difference eqn}
loop2:     DM(I0, M0) = MR1, MR = MR1 * MY0 (SS);   {and store in y buffer}

.ENDMOD;                             {End of REC_FILT Program}

Listing 5-6: Recursive Filter Program 



This program is similar to the previous CONVOLV program in that it has no external 
input/output operations. Since x(n) = δ(n), we don't need a circular buffer for the input 
sequence. In fact, δ(n) can be simulated by choosing x(n) = 0 and y(0) = 1 (the value loaded 
into MR1 in the listing) and then driving the recursive equation using this initial condition. 
Therefore the only circular buffer we need in this program is for the output sequence y(n). 
The coefficient 0.5 is stored in MY0 while the MR register is initialized to 1. Note that in 
the fractional 1.15 format, the value 1 is approximated by 0x7FFF. The difference equation is 
simulated using a multifunction instruction and the DO Loop computes 10 samples of the 
impulse response. 
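
As a cross-check for the values that should appear in the dm_y buffer, the loop can be modelled off-line. This Python sketch assumes truncation of the product (no RND option) and only handles the non-negative values that occur here; the function name is illustrative.

```python
def impulse_response_q15(samples=10, coefficient=0x4000, y0=0x7FFF):
    """Model of loop2: store MR1, then MR = MR1 * MY0 in 1.15 arithmetic."""
    dm_y = []
    mr1 = y0
    for _ in range(samples):
        dm_y.append(mr1)                   # DM(I0,M0) = MR1
        mr = (mr1 * coefficient) << 1      # fractional multiply, no rounding
        mr1 = (mr >> 16) & 0xFFFF          # upper 16 bits of the product
    return dm_y


print([f"0x{v:04X}" for v in impulse_response_q15()])
```

Each stored value is roughly half the previous one, as expected for h(n) = 0.5^n. Because 1 is represented by 0x7FFF rather than 0x8000, the sequence reads 0x7FFF, 0x3FFF, 0x1FFF, and so on, instead of exact powers of two.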

Obtain a copy of REC_FILT.DSP from the diskette and load it in your working directory. 
Use the EZLAB1.ACH architecture description file to assemble and link this program, and 
generate REC_FILT.EXE and REC_FILT.SYM files for the Simulator. Invoke the Simulator 
and load the REC_FILT program. We can either single step through the Simulator to observe 
various register operations or execute the entire program. The program will stop when it tries 
to execute an undefined program memory location. At this time check the data memory 
corresponding to the dm_y buffer and verify that it does contain proper impulse response results. 

EXPERIMENT 3B 

In this experiment, we will develop a more general (Nth order) AR filter program to 
process speech signals in real-time. 

1. Using the REC_FILT program as a guide, write EXPMT-3B.DSP program to 
incorporate the following features: 

• Input from the microphone and output to the speaker, 

• AR filter coefficients (a_k's) in the Program Memory, 

• Output samples in the Data Memory, 

• Multifunction instruction inside the recursion loop, 

• Overflow protection (detection and saturation) 

2. Using EXPMT-3B program, implement the following AR filter equation: 

$$H(z) = \frac{K}{1 + 0.9 z^{-1} + 0.81 z^{-2}}$$

Determine and plot the pole-zero diagram and the magnitude response of the above 
filter. Determine the numerator K so that the maximum gain of the filter is equal to 
1. Verify the operation of this filter experimentally using a sinusoidal source. 

3. Using the above filter, experiment with speech signals and record your observations. 

5.4.3 Exercises 



1. Write a program to implement a 7-tap FIR filter with rectangular tap weights of equal 
value b_n, 0 ≤ n ≤ 6. Determine and plot the frequency response of this rectangular 
window and verify it using a sinusoidal source. 

2. Consider the above 7-tap FIR filter in which the rectangular tap weights are replaced 
by triangular tap weights given by: 

$$b_n = \begin{cases} n, & 0 \le n \le 3 \\ 6-n, & 4 \le n \le 6 \end{cases}$$

Determine and plot the frequency response of this triangular window and verify it 
using a sinusoidal source. 

3. Combine the programs EXPMT-3A.DSP and EXPMT-3B.DSP to implement a 
general Auto-Regressive Moving Average (ARMA) filter given by 

$$y(n) = \sum_{m=0}^{M} b_m\, x(n-m) + \sum_{k=1}^{N} a_k\, y(n-k)$$

Using this program, implement the following All-Pass filter and verify its operation. 

$$H(z) = \frac{0.72 - 1.7 z^{-1} + z^{-2}}{1 - 1.7 z^{-1} + 0.72 z^{-2}}$$



4. An interesting example of a stable recursive filter is reverberation in a good concert 
hall, whose impulse response may last for several seconds. This impulse response, 
h(t), is given by 

$$h(t) = \delta(t-\tau) + a\,\delta(t-2\tau) + a^2\,\delta(t-3\tau) + \cdots$$

where the concert hall feeds back τ seconds later a fraction, a, of its delayed input. 
For τ = 1 ms and a = 0.993, a reverberation time (the time for the response to decay by 
60 dB) of 1 second can be achieved. This reverberation effect can be simulated 
on the ADSP-2101 system. The difference equation for the above impulse response 
is 

$$y(n) = x(n-N) + a\,y(n-N)$$

in which the delay in samples, N, depends on the sampling frequency F_s and is given by 

$$N = \tau F_s$$

Write a program to simulate this reverberation effect for an 8 kHz sampling frequency 
and demonstrate its results. 
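
The numbers quoted in this exercise can be checked with a few lines of arithmetic. The Python sketch below is only a calculator for the two quantities involved; the function name and argument names are illustrative assumptions.

```python
import math


def reverb_parameters(tau_seconds, feedback, sample_rate_hz):
    """Return (delay in samples, time in seconds to decay by 60 dB)."""
    delay_samples = round(tau_seconds * sample_rate_hz)    # N = tau * Fs
    # Each trip around the loop scales the echo by `feedback`;
    # 60 dB of decay is an amplitude ratio of 10**-3.
    trips = math.log(1e-3) / math.log(feedback)
    return delay_samples, trips * tau_seconds


n, t60 = reverb_parameters(1e-3, 0.993, 8000)
print(n, round(t60, 2))
```

With τ = 1 ms and F_s = 8 kHz the delay is N = 8 samples, and with a = 0.993 the decay time comes out just under 1 second, in line with the figure given above.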
Until now we studied operations that are necessary to implement the most basic DSP algorithms 
such as filtering. The common thread among all these algorithms is shift, multiply and add. 
However, there are operations other than the above which are required in advanced DSP 
applications. Consider an example of an adaptive filter used in echo cancellation. An echo canceller 
is implemented as an FIR filter whose filter coefficients are not constant but must be changed 
adaptively to track varying characteristics of a typical telephone channel. The coefficients are 
changed according to some optimum strategy which may require operations such as a square-root 
or a transcendental function. Sometimes we may have a need to generate pseudo random 
numbers to implement a random source. Therefore we need a procedure to implement these 
operations using basic computational blocks of the ADSP-2101 microcomputer. 

Transcendental functions such as sines and logarithms are often approximated by 
polynomial expansions. The most widely used of these are the Taylor and Maclaurin series. They 
can be used to approximate almost any function whose derivative is defined over the specified 
input range. In this experiment we will develop subroutines to implement sine function 
approximation from a polynomial expansion and random number generation using the linear 
congruence method. Other transcendental functions can similarly be implemented using their 
appropriate polynomial expansions. Because the ADSP-2101 performs single precision (16-bit) 
fixed point operations, the accuracy of a polynomial expansion decreases as the order of the 
polynomial increases. Therefore the order of the polynomial must be limited to the minimum 
using optimized coefficients for the polynomials in the function approximation. 






There are two important programming concepts in this experiment. First, the use of 
subroutines as a programming tool. We will implement these functions as subroutines and then 
call them up whenever needed. Therefore we will study how to interface the subroutines with 
a main program. To devise a waveform generator using functions, we need to generate function 
values at proper periodic time intervals, i.e., we need timing information. Hence the second 
important concept we will study is the use of the programmable interval Timer of the ADSP-2101 
to generate periodic interrupts. Using these concepts we will generate and verify a single 
frequency tone at some frequency in Experiment 4A while in 4B we will produce and test a 
uniformly distributed random noise sequence. 

5.5.1 Sine Function Generator 

The following formula approximates the sine of the input variable x: 

$$\sin(x) = 3.140625x + 0.02026367x^2 - 5.325196x^3 + 0.5446778x^4 + 1.800293x^5$$


The approximation is accurate for any value of x from 0° to 90° (the first quadrant). However, 
because sin(-x) = -sin(x) and sin(x) = sin(180° - x), the sine of any angle can be obtained from 
the sine of an angle in the first quadrant. 

The subroutine, SINE.DSP, that implements this sine approximation, accurate to within 
two least significant bits (LSBs), is shown in Listing 5-7. This routine accepts input values in 
1.15 format. The coefficients, which are initialized in data memory in 4.12 format, have been 
adjusted to reflect an input value scaled to the maximum range allowed by this format. On this 
scale, 180° equals the maximum positive value, 0x7FFF, and -180° equals the maximum negative 
value, 0x8000, as shown in Figure 5-1. 

[Figure 5-1: Scaled angle values. Labeled points: H#4000 = π/2, H#7FFF = π, H#8000 = -π, H#C000 = -π/2, H#FFFF.] 
The routine shown in Listing 5-7 contains a new assembler directive called .ENTRY. It 
makes the label sin visible to other programs (especially the main program) for use in subroutine 
calls or inter-module jumps. The routine then reads the scaled input angle from AX0. This 
angle is first modified to generate the angle in the first quadrant that will yield the same sine 
(or negative sine). If the input is in the second or fourth quadrants (bit 14 of the input value is 
a one) the input is negated to produce the twos complement, which represents an angle in the 
third or first quadrant, respectively. The sign bit of this angle is cleared to produce an angle in 
the first quadrant, and this result is stored in AR. 





{ ADSP-2101 Sine Program                                          SINE.DSP

  This subroutine calculates sine value, Y = sin( X ), by using a
  polynomial approximation:

  sin(x) = 3.140625x + 0.02026367x^2 - 5.325196x^3 + 0.5446778x^4 + 1.800293x^5

  Calling Parameters
        AX0 = X in scaled 1.15 format
        M3 = 1
        L3 = 0
  Return Value
        AR = Y in 1.15 format
}

.MODULE/RAM SINE_FUNCTION;                 { Beginning of SINE Program }
.VAR/DM sin_coeff[5];                      { Sine Polynomial Coefficients }
.INIT sin_coeff: H#3240, H#0053, H#AACC, H#08B7, H#1CCE;
                                           { Initialize coefficients }
.ENTRY sin;                                { Entry point of sine function }

sin:    I3=^sin_coeff;                     { Index to coefficients }
        AY0=H#4000;
        AR=AX0, AF=AX0 AND AY0;            { Check 2nd or 4th quad. }
        IF NE AR=-AX0;                     { If yes, negate input }
        AY0=H#7FFF;
        AR=AR AND AY0;                     { Remove sign bit }
        MY1=AR;
        MF=AR*MY1 (RND), MX1=DM(I3,M3);    { MF = X * X }
        MR=MX1*MY1 (SS), MX1=DM(I3,M3);    { MR = C1 * X }
        CNTR=3;
        DO approx UNTIL CE;                { Calculate sine value }
           MR=MR+MX1*MF (SS);
approx:    MF=AR*MF (RND), MX1=DM(I3,M3);
        MR=MR+MX1*MF (SS);
        SR=ASHIFT MR1 BY 3 (HI);           { Convert to 1.15 format }
        SR=SR OR LSHIFT MR0 BY 3 (LO);
        AR=PASS SR1;
        IF LT AR=PASS AY0;                 { Saturate if needed }
        AF=PASS AX0;
        IF LT AR=-AR;                      { Negate if needed }
        RTS;

.ENDMOD;                                   { End of SINE Program }

Listing 5-7: Sine Approximation 

If the original angle is in the first quadrant, its value is unchanged. If it is in the second 
quadrant, negation changes it to the third quadrant, and the sign bit removal changes it to the 
first quadrant. If the original angle is in the third quadrant, the removal of the sign bit changes 
it to the first quadrant. An angle that is originally in the fourth quadrant is changed to the first 
quadrant by negation. 
The sine of the modified angle is calculated by multiplying increasing powers of the angle 
by the appropriate coefficients. The square of the angle is computed and stored in MF while 
the first coefficient is fetched from data memory. The first term of the sine approximation is 
stored in the MR registers (in which the result is subsequently accumulated) in parallel with the 
second coefficient fetch. In the approx: loop, the next term of the approximation is computed 
and added to the partial result in MR; then the multiplication instruction fetches the next 
coefficient and generates the next power of the angle at the same time. 

Because the coefficients are in 4.12 format, a shift instruction is needed to scale the result 
to 1.15 format. The result is then checked for overflow. If the value in SR1 exceeds 0x7FFF, 
the routine saturates the result at the maximum positive value, 0x7FFF, which is read from AY0. 
Then the sign of the result is restored, if necessary. If the input angle (stored in AX0) is negative, 
the result must be negated. 
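
The quadrant folding and the polynomial can be checked off-line before running the subroutine. The Python sketch below follows the same steps but evaluates the polynomial in floating point, so it illustrates the method and is not a bit-exact model of the fixed-point routine. The names sine and signed16 are illustrative; the coefficient words are the ones from the .INIT line of Listing 5-7.

```python
import math

COEFFS_4_12 = [0x3240, 0x0053, 0xAACC, 0x08B7, 0x1CCE]


def signed16(word):
    """Interpret a 16-bit word as a two's complement number."""
    return word - 0x10000 if word & 0x8000 else word


def sine(ax0):
    """Sine of a scaled angle (0x4000 = pi/2, 0x8000 = -pi)."""
    ar = ax0 & 0xFFFF
    if ar & 0x4000:                     # 2nd or 4th quadrant: negate input
        ar = (-ar) & 0xFFFF
    ar &= 0x7FFF                        # remove sign bit: first quadrant
    x = ar / 32768.0                    # 1.15 fraction (1.0 would be pi)
    result = sum(signed16(c) / 4096.0 * x ** (k + 1)
                 for k, c in enumerate(COEFFS_4_12))
    result = min(result, 32767 / 32768.0)    # saturate at 0x7FFF
    return -result if ax0 & 0x8000 else result


for angle in (0x2000, 0x4000, 0x6000, 0xA000, 0xE000):
    exact = math.sin(math.pi * signed16(angle) / 32768.0)
    print(f"0x{angle:04X}: approx {sine(angle):+.5f}  exact {exact:+.5f}")
```

The five test angles cover 45°, 90°, 135°, -135° and -45°, one or more per quadrant, so the printout exercises both the negation and the sign-bit removal described above.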

This routine requires 25 cycles to generate one sine value. At 12.5 MHz processor speed 
this means that one sine value is available in 2 microseconds. 

A simple simulator main program to generate sin(π/2) using the above sine subroutine is 
shown in Listing 5-8. It contains the assembler directive .EXTERNAL to interface a subroutine 
with the main program. This directive assigns the external attribute to the sin: label declared 
in another module using the .ENTRY directive. Therefore it is now possible to call the sine 
subroutine from the main program. Since the calling parameters of this subroutine are AX0, 
M3, and L3, these must be initialized before the subroutine is called. AX0 is set to 0x4000, 
which corresponds to π/2 radians. After the program returns from the subroutine, the sine value 
will be in the AR register in 1.15 format. 



.MODULE/RAM/BOOT=0/ABS=0 MAIN;
.EXTERNAL sin;

        JUMP start; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;
        RTI; NOP; NOP; NOP;

start:  AX0=0x4000;
        M3=1;
        L3=0;
        CALL sin;

.ENDMOD;

Listing 5-8: Main Program 

To generate one executable file, assemble the main program and the subroutine separately. 
Then link the two using the EZLAB1.ACH architecture description file. Finally execute the 
main program in the simulator. Verify that the correct sine of π/2 is in the AR register. 
5.5.2 TIMER Operation 

So far we generated one sine value using the subroutine. To produce a sine waveform we have 
to compute several sine values over the 0 to 2π interval (i.e. over one cycle) and generate these 
samples at periodic intervals. The time interval between two consecutive samples determines 
the frequency of the waveform. This periodic time interval generation is accomplished by the 
Timer unit of the processor. Therefore we have to study the Timer in detail in order to understand 
its operation. 

The ADSP-2101 Timer includes two 16-bit registers, TCOUNT and TPERIOD, and one 
8-bit register, TSCALE. These registers are memory mapped: TPERIOD at 0x3FFD, TCOUNT 
at 0x3FFC and TSCALE at 0x3FFB. The mode control instruction enables and disables the 
timer by setting and clearing bit 5 in the mode status register, MSTAT. The Timer must be 
enabled before its counting capabilities can be used. 

TPERIOD is the period register which holds the period of the interrupt in cycles. TCOUNT
is the count register. When the timer is enabled, TCOUNT is decremented as often as once
every instruction cycle. When the counter reaches zero, an interrupt is generated. TCOUNT
is then reloaded from the TPERIOD register and the count begins again. TSCALE stores a
scaling value that is one less than the number of cycles between decrements of TCOUNT. For
example, if the value in the TSCALE register is 0, the counter register decrements once every
cycle. If the value in TSCALE is 1, the counter decrements once every two cycles. Therefore,
using these three registers at an 80 ns cycle time, the longest interval between interrupts can
be set from 5.24 ms (when TPERIOD is at maximum and TSCALE at minimum, with a resolution
of 80 ns) up to 1.34 seconds (when both TPERIOD and TSCALE are at maximum, with a
resolution of 20.48 µs).
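
The timer arithmetic above can be checked with a short host-side calculation. The Python sketch below is only an illustration, not part of the ADSP-2101 tools: the function names are hypothetical, and it assumes that one interrupt takes (TPERIOD + 1) decrements of (TSCALE + 1) cycles each, which is the reading that reproduces the 5.24 ms and 1.34 s figures quoted above.

```python
CYCLE_NS = 80  # instruction cycle time quoted in the text (12.5 MHz clock)


def timer_interval_s(tperiod, tscale, cycle_ns=CYCLE_NS):
    """Time between timer interrupts, in seconds.

    Assumption: one interrupt takes (TPERIOD + 1) decrements and each
    decrement takes (TSCALE + 1) instruction cycles.
    """
    if not 0 <= tperiod <= 0xFFFF:
        raise ValueError("TPERIOD is a 16-bit register")
    if not 0 <= tscale <= 0xFF:
        raise ValueError("TSCALE is an 8-bit register")
    return (tperiod + 1) * (tscale + 1) * cycle_ns * 1e-9


def tperiod_for_rate(rate_hz, tscale=0, cycle_ns=CYCLE_NS):
    """Nearest TPERIOD value for a desired interrupt rate (same assumption)."""
    cycles = round(1e9 / (rate_hz * cycle_ns * (tscale + 1)))
    return cycles - 1


if __name__ == "__main__":
    print(timer_interval_s(0xFFFF, 0))     # longest interval with TSCALE = 0
    print(timer_interval_s(0xFFFF, 0xFF))  # longest interval overall
    # A 3 KHz tone with 16 samples per cycle needs a 48 KHz interrupt rate.
    print(tperiod_for_rate(3000 * 16))
```

Because the cycle count must be an integer, the interrupt rate that is actually achieved can differ slightly from the requested one, which is worth keeping in mind when the generated frequency is verified on the scope.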

EXPERIMENT 4A 

In this experiment we will generate a sinusoidal waveform at 3 KHz frequency with 16
samples per cycle and display it on the scope.

1. Using the SINE.DSP subroutine, write the EXPMT-4A.DSP program to incorporate
the following features:

• Output to the DAC0 port.

• 16 samples per cycle. Store 16 angle values in Data memory as a circular buffer
and initialize them from a disk file.

• Properly set the Timer registers for 16 samples per cycle and 3 KHz frequency.

• Enable the Timer interrupt by setting bit 0 in IMASK.

• Enable the Timer by setting bit 5 in the mode status register MSTAT.

2. Using the EXPMT-4A program, generate the sinusoidal waveform, display it on the
scope and verify that the generated frequency is correct.

3. Also verify the generated frequency by "listening" to it on the speaker and comparing
it with one generated using the function generator.

5.5.3 Uniform Random Number Generator

Although the generation of a random number is not, strictly speaking, a function, it is a useful
operation for many applications. One such application is in high-speed modems, in which it
can be used as a training signal for the adaptive equalizer. The means for generating random
numbers on a digital computer, of course, is by the computation of a function that approximates
the random number. Many such functions have been proposed. The implementation presented
here is based on the linear congruence method, which uses the following equation.

$$x(n+1) = (a\,x(n) + c) \bmod m$$

The initial value of x, x(0), is called the seed value and is generally not important, because with
a good choice of a and c all m values are generated before the output sequence repeats. The
random number sequence produced by the above equation is thus uniform in the sense that the
output is uniformly distributed between 0 and m-1. Of course different seed values should be
used at different times if different sequences are desired. By choosing the modulus m = 2^32,
one can ensure a long sequence and have a convenient modulus for the ADSP-2101.

Listing 5-9 shows the ADSP-2101 routine used to compute random numbers based on the
linear congruence method. The values of a and c that are used in this program (a = 1664525
and c = 32767) were chosen according to the rules given in Knuth [7]. The initial seed value is
stored in the SR register before the subroutine is called for the first time and the first number
produced by this routine is the initial seed value. The random numbers are returned in the AR
register while the subsequent seed values are in the SR register. Note that, although only the
most significant 16 bits of the 32-bit x value are used as random numbers in this routine, any
or all of the bits can be used. However, when using a value of m equal to the word size of the
machine, the least significant bits of x(n) are much less random than the most significant bits.
Thus, one should always use the b most significant bits when only a b-bit random number is
desired.
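
Before turning to the assembly listing, the recurrence can be tried on a host computer. The following Python sketch is an illustration only (the names lcg_step and urand_model are hypothetical); it uses the constants quoted above and mimics the routine's behavior of returning the upper 16 bits of the current seed before advancing it.

```python
A = 1664525   # multiplier a quoted in the text
C = 32767     # increment c quoted in the text
M = 2 ** 32   # modulus m


def lcg_step(x):
    """One step of the linear congruence method: x(n+1) = (a x(n) + c) mod m."""
    return (A * x + C) % M


def urand_model(seed, count):
    """Return `count` 16-bit values and the final 32-bit seed.

    Like the routine described above, each call reports the most
    significant 16 bits of the current seed and then advances the seed,
    so the first value is simply the upper half of the initial seed.
    """
    x = seed % M
    values = []
    for _ in range(count):
        values.append(x >> 16)   # keep the most significant 16 bits
        x = lcg_step(x)
    return values, x


if __name__ == "__main__":
    numbers, last_seed = urand_model(1, 5)
    print(numbers, hex(last_seed))
```

Note that a = 25 * 65536 + 26125, which is how the multiplier is split into the upper and lower 16-bit halves for the multiplications in the listing.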



{ ADSP-2101 Random Subprogram                                   RANDOM.DSP

  This subroutine calculates uniformly distributed random numbers using a
  linear congruence method

  Calling Parameters
        SR1 = MSW of the seed value
        SR0 = LSW of the seed value

  Return Values
        AR  = The Random Number
        SR1 = MSW of the new seed value
        SR0 = LSW of the new seed value

  Altered Registers
        MY0, MY1, MR, SI, SR, AR

  Computation Time
        10*N + 4 cycles

  Note: The first time this routine is called it will return the portion
        of the seed that is in SR1. Subsequent times it will return a
        pseudo-random number in AR.
}

.MODULE/RAM URAND_SUB;                  { Beginning of RANDOM Subprogram }
.ENTRY  urand;                          { Entry point of random function }

urand:  MY1=25;                         { Upper half of a }
        MY0=26125;                      { Lower half of a }
        AR=SR1, MR=SR0*MY1 (UU);        { a(hi) * x(lo) }
        MR=MR+SR1*MY0 (UU);             { a(hi) * x(lo) + a(lo) * x(hi) }
        SI=MR1;
        MR1=MR0;
        MR2=SI;
        MR0=H#FFFE;                     { c=32767, left-shifted by 1 }
        MR=MR+SR0*MY0 (UU);             { (above) + a(lo) * x(lo) + c }
        SR=ASHIFT MR2 BY 15 (HI);       { right-shift by 1 }
        SR=SR OR LSHIFT MR1 BY -1 (HI);
        SR=SR OR LSHIFT MR0 BY -1 (LO);
        RTS;

.ENDMOD;                                { End of RANDOM Subprogram }

Listing 5-9: Random Number Generator



The routine requires 10N + 4 cycles to execute, where N is the number of random numbers
desired. For example, computing 2^16 (65,536) random numbers using a 12.5 MHz ADSP-2101
takes 52.43 milliseconds. Computing all m = 2^32 numbers in the sequence requires almost one
hour.

EXPERIMENT 4B 

In this experiment we will generate a random sequence at an 8 KHz sampling rate.

1. Using the RANDOM.DSP subroutine, write the EXPMT-4B.DSP program to generate a
random sequence. Adjust the timer registers to generate the sequence at an 8 KHz
sampling rate. Output this random sequence to the SPORT0 transmit register.

2. Connect the speaker to the SPORT0 transmit register and listen to the output.
Comment on the auditory qualities of the noise.

5.5.4 Exercises 

1. The sinusoidal waveform can also be generated using a second order recursive filter
given by

$$y(n) = x(n) + 2\cos(\omega_0)\,y(n-1) - y(n-2)$$

When this filter is excited by an impulse, i.e., x(n) = δ(n), it generates a digital sine
wave with digital frequency of ω0 rad/sample since it has two poles on the unit circle at
conjugate angles ±ω0. The frequency of the analog sine wave generated by this filter
is given by

$$F = \frac{\omega_0 F_s}{2\pi}$$

where F_s is the sampling frequency. Using the programs developed in Experiment
3, write a program to generate a 1 KHz tone. Note that one of the filter coefficients
can range between -2 and 2. Hence implement a 2.14 fixed-point format representation
in your program. Comment on any possible problems with this approach in
waveform generation.

2. Write a program to generate a noisy sinusoidal waveform by incorporating both the
sine and the random number generators. Generate a 2 KHz waveform corrupted by
noise distributed uniformly between -1 and 1. Implement a 2.14 fixed-point format
in your program.

3. To generate random numbers from a Gaussian distribution one can add 12 numbers
from the [0, 1] uniform generator and subtract 6. The mean of the uniform generator
is 1/2 so that the subtraction of 6 from the sum gives a mean of exactly 0. Since the
variance of the original uniform distribution is 1/12, the 12 independent numbers give
a distribution with variance exactly 1. Write a program to generate Gaussian random
numbers with mean 0 and variance 1.
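
As a floating-point reference for Exercise 1, the second order recursion can be run on a host computer before it is coded in 2.14 fixed-point format. The Python sketch below is only a checking aid under stated assumptions: the function name is hypothetical, the 8 KHz sampling rate is the codec rate used elsewhere in this book, and the closed form sin((n+1)ω0)/sin(ω0) is the standard impulse response of this recursion, used here for comparison.

```python
import math


def recursive_sine(omega0, count):
    """Impulse response of y(n) = x(n) + 2cos(w0) y(n-1) - y(n-2)."""
    coeff = 2.0 * math.cos(omega0)   # the coefficient that lies between -2 and 2
    y1 = 0.0                         # y(n-1)
    y2 = 0.0                         # y(n-2)
    out = []
    for n in range(count):
        x = 1.0 if n == 0 else 0.0   # unit impulse input
        y = x + coeff * y1 - y2
        out.append(y)
        y2, y1 = y1, y
    return out


if __name__ == "__main__":
    fs = 8000.0                      # assumed sampling rate in Hz
    f = 1000.0                       # desired tone in Hz
    omega0 = 2.0 * math.pi * f / fs
    for n, y in enumerate(recursive_sine(omega0, 8)):
        expected = math.sin((n + 1) * omega0) / math.sin(omega0)
        print(n, round(y, 6), round(expected, 6))
```

The output amplitude is 1/sin(ω0) rather than 1, so it grows as ω0 becomes small; this is one of the points to consider when commenting on problems with the fixed-point version.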

5.6 SUMMARY 

In this chapter, we described several simple experiments as a vehicle to effectively program
and use the microcomputer. In the process we also implemented different DSP operations and
analyzed the results in terms of the expected and achieved outcomes. What we studied is the
essence of DSP because more complicated operations can be achieved using these simple
programs. We can now treat the ADSP-2101 microcomputer as a tool to generate more
sophisticated algorithms and develop useful and interesting applications. This is what we will
do in the next five chapters.




chapter 6 



FIR FILTER IMPLEMENTATIONS 



6.1 INTRODUCTION 

Digital signal processing comprises two important areas: signal filtering in the time domain
and signal representation in the frequency domain. With the programming background that we
have developed in the last chapter, we are now in a position to treat the subject of digital filter
design and implementation on the ADSP-2101. As described in Chapter 5, there are two types
of digital filters, namely FIR and IIR. This chapter deals with the design and implementation
of FIR filters. A similar treatment for IIR filters is given in the next chapter. In Chapter 8, we
describe signal representation in the form of the Discrete Fourier Transform (DFT) and an
efficient method for computing the DFT, called the Fast Fourier Transform Algorithm.

In practice, FIR filters are employed in filtering problems where there is a requirement
for a linear phase characteristic within the passband of the filters. Using linear phase FIR filters
it is possible to implement such systems as differentiators and Hilbert transformers, which
cannot be implemented using IIR filters. However, if phase distortion is unimportant then IIR
filters are generally preferable. In this chapter, we consider FIR filters which are of a
frequency-selective type, i.e., we will design and implement filters which are multiband filters
with magnitude response specified in each band. There is another class of filters in which
frequency-domain characteristics of the desired filters are not specified explicitly. These filters
include Wiener filters, inverse filters for deconvolution, and equalizers, in which the design
criterion is specified in terms of minimizing some performance measure in the time domain.
Some of these filters will be considered later in Chapter 10. From an implementation point of
view, however, this classification is unimportant.

We begin in Section 6.2 with a brief discussion on frequency-selective linear phase FIR 
filters. We describe window design technique and the Parks-McClellan algorithm as two 
representative methods of FIR filter design. In Section 6.3, we describe the implementation of 
FIR filters using single-precision arithmetic. In some applications, the 16-bit precision of the 
ADSP-2101 microcomputer may not be enough. Therefore, we describe a double-precision 
implementation in Section 6.4. Section 6.5 deals with the implementation of all-zero lattice 
filters. Finally, in Section 6.6 we present a single sideband modulator as an example of a 
complete system implementation involving many components besides an FIR filter. 

6.2 OVERVIEW OF FIR FILTER DESIGN

An FIR filter has several design advantages over an Infinite-duration Impulse Response (IIR)
digital filter:

• it is always stable,

• it is always realizable, and

• it can always be designed to have exact linear phase.

The third advantage makes the FIR filter an indispensable choice in applications requiring no
delay distortion but only fixed delay. Additionally, the design problem for linear phase FIR
filters requires only real arithmetic which is easy to implement. Therefore in this overview, we
discuss design techniques and examples of linear phase FIR filters.

As discussed in Chapter 4, an FIR filter is described by the difference equation (or
convolution formula)

$$y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k) \qquad (6\text{-}1)$$

where h(n), n = 0, ..., N-1 is the impulse response of the filter. The frequency response of the
FIR filter is given by

$$H(e^{j\omega}) = \sum_{n=0}^{N-1} h(n)\,e^{-j\omega n}$$

The magnitude of the frequency response, |H(e^{jω})|, is called the magnitude response while the
angle, ∠H(e^{jω}), is called the phase response. If we impose a linear phase constraint on the phase
response in the form

$$\angle H(e^{j\omega}) = \beta - \alpha\omega, \quad \omega > 0$$

then it can be shown [7] that we have the following two solutions:

• either

$$h(n) = h(N-1-n), \quad 0 \le n \le N-1, \qquad \beta = 0, \quad \text{and} \quad \alpha = \frac{N-1}{2}$$

• or

$$h(n) = -h(N-1-n), \quad 0 \le n \le N-1, \qquad \beta = \pm\frac{\pi}{2}, \quad \text{and} \quad \alpha = \frac{N-1}{2}$$

The first solution results in the impulse response h(n) being symmetric about α. All multiband
(lowpass, highpass, bandpass, or bandstop) linear-phase FIR filters exhibit this behavior. The
second solution implies that the impulse response h(n) is antisymmetric about α, a behavior
exhibited by FIR differentiators and Hilbert transformers. These symmetry conditions can be
used to reduce multiplications by about 50% in implementing FIR filters. However in our
ADSP-2101 implementations, multiplication and addition can be done in one instruction cycle.
Therefore symmetry conditions are of little importance from an implementation point of view.
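
The two symmetry conditions can be verified numerically. The short Python sketch below is only an illustration with hypothetical function names and arbitrary test coefficients: it evaluates the frequency response sum directly and removes the linear phase term, after which a symmetric h(n) should leave a purely real quantity (β = 0) and an antisymmetric h(n) a purely imaginary one (β = ±π/2).

```python
import cmath


def freq_response(h, omega):
    """H(e^{jw}) = sum over n of h(n) e^{-jwn}, evaluated at one frequency."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))


def remove_linear_phase(h, omega):
    """Multiply H(e^{jw}) by e^{+j alpha w}, with alpha = (N-1)/2."""
    alpha = (len(h) - 1) / 2
    return freq_response(h, omega) * cmath.exp(1j * alpha * omega)


if __name__ == "__main__":
    symmetric = [1.0, 2.0, 3.0, 2.0, 1.0]        # h(n) =  h(N-1-n)
    antisymmetric = [1.0, 2.0, 0.0, -2.0, -1.0]  # h(n) = -h(N-1-n)
    for omega in (0.3, 1.0, 2.5):
        print(remove_linear_phase(symmetric, omega),
              remove_linear_phase(antisymmetric, omega))
```

Up to rounding error, the first value printed on each line has zero imaginary part and the second has zero real part.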

The linear-phase FIR filters are specified in terms of their magnitude response. The
essence of FIR filter design is then to determine a causal impulse response h(n) to approximate
the given magnitude specifications subject to symmetry conditions on h(n). The phase response
then can be determined from the length and the symmetry of the impulse response. In the
following sections we briefly describe two well known approaches to FIR filter design: the window
design method and the Parks-McClellan algorithm.

6.2.1 Window Design Method

The simplest method of designing a linear phase FIR filter is the window design method. The
given magnitude specifications can be thought of as an ideal filter H_d(e^{jω}) to be approximated.
For example, an ideal lowpass filter has unity magnitude response and linear phase over the
passband and zero response over the stopband:

$$H_d(e^{j\omega}) = \begin{cases} 1 \cdot e^{-j\alpha\omega}, & |\omega| \le \omega_c \\ 0, & \omega_c < |\omega| \le \pi \end{cases} \qquad (6\text{-}2)$$

where ω_c is called the corner frequency. Clearly the impulse response of the ideal filter, h_d(n),
is of infinite duration given by

$$h_d(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} H_d(e^{j\omega})\,e^{j\omega n}\,d\omega \qquad (6\text{-}3)$$

To obtain a causal FIR filter of length N, we may truncate h_d(n), i.e.,

$$h(n) = \begin{cases} h_d(n), & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad \alpha = \frac{N-1}{2}$$

This operation is called windowing. In general, h(n) can be thought of as being formed by the
product of h_d(n) and a window function, w(n), as follows:

$$h(n) = h_d(n)\,w(n) \qquad (6\text{-}4)$$

and

$$w(n) = \begin{cases} \text{some symmetric function over } 0 \le n \le N-1 \\ 0, \quad \text{otherwise} \end{cases}$$

Depending upon how w(n) is specified, one obtains different window designs. Some commonly
used window types and their functions are shown in Table 6-1. In Table 6-2, we provide a
summary of window function characteristics in terms of transition width (as a function of N)
and the minimum stopband attenuation (in DB). These tables along with the given specifications
can be used to obtain the approximated impulse response h(n). In the remainder of this section,
we describe the window design technique by using examples of some representative digital
filters.

Window Type     Window Function w(n), 0 ≤ n ≤ N-1
-----------     ---------------------------------
Rectangular     1

Bartlett        2n/(N-1),                0 ≤ n ≤ (N-1)/2
                2 - 2n/(N-1),            (N-1)/2 ≤ n ≤ N-1

Hanning         0.5 [1 - cos(2πn/(N-1))]

Hamming         0.54 - 0.46 cos(2πn/(N-1))

Blackman        0.42 - 0.5 cos(2πn/(N-1)) + 0.08 cos(4πn/(N-1))

Kaiser          I0[β sqrt(1 - (1 - 2n/(N-1))^2)] / I0(β)
                where I0 is the modified Bessel function of order zero.

Table 6-1: Window Functions



Window Type         Transition Width Δω     Minimum Stopband Attenuation in DB
-----------         -------------------     ----------------------------------
Rectangular         1.8π/N                  21
Bartlett            5.6π/N                  25
Hanning             6.2π/N                  44
Hamming             6.6π/N                  53
Blackman            11π/N                   74
Kaiser (β=4.54)     5.8π/N                  50
Kaiser (β=5.67)     7.8π/N                  60

Table 6-2: Window Function Characteristics

Example 6-1:

In the first example, we will design a lowpass filter for our EZ-LAB target system. The
codec in the EZ-LAB target system operates at an 8 KHz sampling rate. Therefore we
design our filter for the following specifications:

Sampling frequency: 8 KHz

Passband: 0 - 1 KHz

Stopband: 1.4 - 4 KHz

Stopband Attenuation: 50 DB

From Table 6-2 we observe that both the Hamming and Blackman window functions can
provide attenuation of more than 50 DB. Let us choose the Hamming window, which is
the better of the two because its narrower transition width gives a shorter filter. The
required transition width is 0.4 KHz or

$$\Delta\omega = \frac{2\pi \cdot 400}{8000} = 0.1\pi$$

From Table 6-2, using the transition width column, we obtain the filter length N as

$$N = \frac{6.6\pi}{0.1\pi} = 66$$

We choose N = 67 which gives α = (N-1)/2 = 33. The corner frequency of the ideal lowpass
filter is

$$\omega_c = \frac{2\pi}{8000}\left(\frac{1000 + 1400}{2}\right) = 0.3\pi$$

From equation (6-2), the ideal frequency response is

$$H_d(e^{j\omega}) = \begin{cases} 1 \cdot e^{-j33\omega}, & |\omega| \le 0.3\pi \\ 0, & 0.3\pi < |\omega| \le \pi \end{cases}$$

and from equation (6-3), the ideal impulse response is

$$h_d(n) = \frac{\sin[0.3\pi(n-33)]}{\pi(n-33)}$$

From Table 6-1, the window function for the Hamming window is

$$w(n) = 0.54 - 0.46\cos(2\pi n/66), \quad 0 \le n \le 66$$

Finally from equation (6-4), the impulse response of the designed FIR filter is

$$h(n) = \frac{\sin[0.3\pi(n-33)]}{\pi(n-33)} \cdot [0.54 - 0.46\cos(2\pi n/66)], \quad 0 \le n \le 66$$

These FIR filter coefficients are shown in Table 6-3 and the frequency response is shown
in Figure 6-1.
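
The design equations of Example 6-1 are easy to evaluate on a host computer. The Python sketch below is only an illustration (the function name and its arguments are hypothetical, not part of the book's software); it computes the Hamming-windowed ideal lowpass response for a given length and corner frequency, using the limiting value ω_c/π at the center tap where the sinc expression becomes 0/0.

```python
import math


def hamming_lowpass(num_taps, corner):
    """Window-method lowpass design: h(n) = h_d(n) w(n), 0 <= n <= N-1.

    `corner` is the corner frequency in radians per sample.
    """
    alpha = (num_taps - 1) / 2
    h = []
    for n in range(num_taps):
        m = n - alpha
        if m == 0:
            ideal = corner / math.pi               # limit of sin(wc m)/(pi m)
        else:
            ideal = math.sin(corner * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        h.append(ideal * window)
    return h


if __name__ == "__main__":
    coeffs = hamming_lowpass(67, 0.3 * math.pi)    # values from Example 6-1
    print(len(coeffs), coeffs[33], sum(coeffs))
```

To load such a filter into the ADSP-2101, the coefficients would still have to be converted to 1.15 format; that step is not shown here.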

Table 6-3: Example 6-1 Filter Coefficients (Filter Length = 67)

Figure 6-1: Example 6-1 Frequency Response

Example 6-2:

In this example, we will design a bandpass filter for the following specifications:

Sampling frequency: 8 KHz

Lower Stopband: 0 - 0.6 KHz

Passband: 1 - 2 KHz

Upper Stopband: 2.4 - 4 KHz

Stopband Attenuation: 40 DB

Window Function: Hanning Window

Following the procedure in Example 6-1, we obtain the following results:

Transition width = Min[1-0.6, 2.4-2] = 0.4 KHz or Δω = 0.1π. Hence

$$N = \frac{6.2\pi}{0.1\pi} = 62$$

Choose N = 63, then α = 31.

Corner Frequencies: 0.8 KHz and 2.2 KHz or ω_c1 = 0.2π and ω_c2 = 0.55π

Ideal frequency response:

$$H_d(e^{j\omega}) = \begin{cases} 1 \cdot e^{-j31\omega}, & 0.2\pi \le |\omega| \le 0.55\pi \\ 0, & \text{otherwise} \end{cases}$$

Ideal impulse response:

$$h_d(n) = \frac{\sin[0.55\pi(n-31)] - \sin[0.2\pi(n-31)]}{\pi(n-31)}$$

Hanning window function:

$$w(n) = 0.5 - 0.5\cos(2\pi n/62), \quad 0 \le n \le 62$$

Finally, the FIR filter coefficients are:

$$h(n) = \frac{\sin[0.55\pi(n-31)] - \sin[0.2\pi(n-31)]}{\pi(n-31)} \cdot \left[0.5 - 0.5\cos\left(\frac{2\pi n}{62}\right)\right], \quad 0 \le n \le 62$$

These FIR filter coefficients are shown in Table 6-4 and the frequency response is shown
in Figure 6-2.

Table 6-4: Example 6-2 Filter Coefficients

Figure 6-2: Example 6-2 Frequency Response



Example 6-3: 

In this example, we will design a 40-tap FIR differentiator using a Blackman window.
Differentiators have antisymmetric impulse responses and their length N must be an even
number so that their frequency response is not zero at ω = π. The ideal frequency response
of a typical differentiator is given by




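$$H_d(e^{j\omega}) = \frac{j\omega}{\pi}\,e^{-j\alpha\omega}, \quad -\pi < \omega \le \pi$$

From equation (6-3), for non-integer (n - α) the ideal impulse response is

$$h_d(n) = -\frac{\sin[\pi(n-\alpha)]}{\pi^2 (n-\alpha)^2}$$

Since sin[π(n - α)] = ±1 when (n - α) is an odd multiple of 1/2, the magnitude of each
coefficient is 1/[π²(n - α)²], and h_d(n) is antisymmetric about α.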

Note that since N is even, α and hence (n - α) are not integers. In this example,
α = (40-1)/2 = 19.5. Using the Blackman window function from Table 6-1, the FIR
differentiator coefficients are given by

$$h(n) = h_d(n) \cdot \left[0.42 - 0.5\cos\left(\frac{2\pi n}{39}\right) + 0.08\cos\left(\frac{4\pi n}{39}\right)\right], \quad 0 \le n \le 39$$

These coefficients are shown in Table 6-5 and the frequency response is shown in Figure
6-3. Note the distortion effect at the high frequency end, resulting from our specification
that |H_d(e^{jω})| be linear out to ω = π.

Table 6-5: Example 6-3 Filter Coefficients

Figure 6-3: Example 6-3 Frequency Response



Example 6-4: 

In this last example, we will design a Hilbert transformer which also has an antisymmetric
impulse response but whose filter length N must be an odd number. We will use a Hamming
window to design a 51-tap Hilbert transformer. The ideal frequency response of the Hilbert
transformer is given by
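
$$H_d(e^{j\omega}) = \begin{cases} -j\,e^{-j\alpha\omega}, & 0 < \omega < \pi \\ +j\,e^{-j\alpha\omega}, & -\pi < \omega < 0 \end{cases}$$

From equation (6-3), the ideal impulse response is

$$h_d(n) = \begin{cases} \dfrac{2}{\pi(n-\alpha)}, & (n-\alpha) \text{ odd} \\ 0, & (n-\alpha) \text{ even} \end{cases}$$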



Since N = 51, α = 25. Hence (n - α) is an integer. Using the Hamming window function,
the FIR Hilbert transformer coefficients are given by

$$h(n) = \frac{2}{\pi(n-25)} \cdot \left[0.54 - 0.46\cos\left(\frac{2\pi n}{50}\right)\right], \quad n = 0, 2, \ldots, 50$$

with h(n) = 0 for the remaining (odd) values of n.

These coefficients are shown in Table 6-6 and the frequency response is shown in Figure
6-4. Note that the Hilbert transformer has nulls in the frequency response characteristics
at ω = 0 and ω = π.

Table 6-6: Example 6-4 Filter Coefficients

Figure 6-4: Example 6-4 Frequency Response

6.2.2 Parks-McClellan Algorithm

The window design technique which we discussed in the last section is easy to understand and
use. However it has some disadvantages. First, we do not have precise control over the passband
and stopband edge frequencies. Second, and most important, the approximation error or
ripples are not uniformly distributed over both passband and stopband intervals. This error is
higher near the band edges and lower in regions away from band edges. By distributing the
approximation error uniformly over both passband and stopband, one can obtain a lower order
filter satisfying the same specifications.

Using the design algorithm due to Parks and McClellan [9], it is possible to design FIR
filters that are optimum in approximating magnitude specifications with minimum peak error
for the given order or minimum order for the given error. This algorithm is based on the fact
that the frequency response for a linear phase FIR filter can be expressed as a polynomial in
cos(ω). The problem can then be changed to one of Chebyshev approximation where the best
approximation to the given magnitude response is known to have equiripple behavior. The
essence of this algorithm is to determine a polynomial solution so that the maximum value of
the approximation error is minimized. The Parks-McClellan algorithm incorporates the
well-known Remez-exchange routine for polynomial solution. The details of this algorithm are
provided in many textbooks on digital signal processing including [8]. A computer program
written by Parks and McClellan [9], which is based on their algorithm, is also available for
designing linear phase FIR filters. Using this program we give designs of the FIR filters
described in the first two examples. Our purpose here is to compare design performance in
terms of filter order and stopband attenuation.

Example 6-5: 

Consider the lowpass filter specifications given in Example 6-1, which are repeated below.

Sampling frequency: 8 KHz

Passband: 0 - 1 KHz

Stopband: 1.4 - 4 KHz

Stopband Attenuation: 50 DB

When this filter was designed using the Parks-McClellan algorithm, the filter length of
55 was obtained for stopband attenuation of 50 DB. The filter coefficients are given in
Table 6-7 and the frequency response is shown in Figure 6-5.

Figure 6-5: Example 6-5 Frequency Response



Example 6-6:

Consider the bandpass filter specifications given in Example 6-2, which are repeated
below.

Sampling frequency: 8 KHz

Lower Stopband: 0 - 0.6 KHz

Passband: 1 - 2 KHz

Upper Stopband: 2.4 - 4 KHz

Stopband Attenuation: 40 DB

Window Function: Hanning Window

When this filter was designed using the Parks-McClellan algorithm, the filter length of
55 was obtained for stopband attenuation of 40 DB. The filter coefficients are given in
Table 6-8 and the frequency response is shown in Figure 6-6.

Table 6-8: Example 6-6 Filter Coefficients

Figure 6-6: Example 6-6 Frequency Response

After comparing these designs with the corresponding window designs, it is obvious that
the Parks-McClellan algorithm gives the smallest filter order and equiripple stopband
performance. The smallest filter order is desirable from an implementation point of view. However
in an ADSP-2101 implementation, we do not pay any penalty for a larger filter order so long
as it is moderate. On the other hand, the window design technique allows us to design FIR
filters using simple design equations without resorting to any sophisticated programs. This will
allow us to concentrate on implementations and programming the ADSP-2101. In practice,
when a more refined filter design is required, we can change the assembly language program
by updating the filter coefficients obtained from the Parks-McClellan algorithm.



6.3 SINGLE-PRECISION FIR DIRECT FORM FILTER 

The realization of an FIR filter can take many forms, although the most useful in practice are
the Direct Form (or transversal filter) and the lattice structures. The FIR lattice filter is described
later in Section 6.5. In this section we describe the realization of a single precision FIR Direct
Form structure. This structure can be obtained directly from equation (6-1) for discrete-time
convolution which is repeated below.

$$y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k)$$

We have already studied this equation and its implementation in Section 5.4. A graphic
representation of this Direct Form structure is shown in Figure 6-7.



Figure 6-7: FIR Filter - Direct Form Structure

6.3.1 A Filter Subroutine



The subroutine that realizes this structure is shown in Listing 6-1. The first instruction sets up
the computation by clearing MR and loading MX0 and MY0 with the first data and coefficient
values from data and program memory. The multiply/accumulate with dual data fetch in the
conv loop is then executed N - 1 times in N cycles to compute the sum of the first N - 1 products.
The final multiply/accumulate instruction is performed with the rounding mode enabled to round
the result to the upper 24 bits of MR. MR1 is then conditionally saturated to its most positive
or negative value based on the status of the overflow flag MV. In this manner, results are
accumulated to the full 40 bit resolution of MR, with saturation of the output only if the final
result overflowed beyond the least significant 32 bits of MR.
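
The accumulate, round and saturate sequence described above can be modeled on a host computer. The Python sketch below is a simplified model under stated assumptions, not a bit-exact emulation of the ADSP-2101 multiplier: the function names are hypothetical, the accumulator is an unbounded Python integer instead of the 40-bit MR, and rounding is a plain round-half-up of the Q30 sum to 1.15 format.

```python
def saturate16(value):
    """Clamp an integer to the range of a 16-bit two's-complement word."""
    return max(-32768, min(32767, value))


def fir_q15(samples, coeffs):
    """One output of a direct form FIR filter in 1.15 arithmetic.

    samples[k] holds x(n-k) and coeffs[k] holds h(k), both as signed
    16-bit integers in 1.15 format. Products are accumulated at full
    precision and only the final sum is rounded and saturated.
    """
    acc = 0
    for x, h in zip(samples, coeffs):
        acc += x * h                      # exact Q30 product
    rounded = (acc + (1 << 14)) >> 15     # round the Q30 sum to Q15
    return saturate16(rounded)


if __name__ == "__main__":
    # 0.5 * 0.5 = 0.25, i.e. H#4000 * H#4000 gives H#2000
    print(hex(fir_q15([0x4000], [0x4000])))
    # two near-full-scale products overflow 1.15 and are saturated
    print(fir_q15([32767, 32767], [32767, 32767]))
```

Accumulating first and saturating only at the end mirrors the point made above: intermediate sums may exceed the 1.15 range without harm as long as the final result fits.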



{ Single Precision FIR Direct Form Filter Subroutine            FIRDIRFM.DSP

  Calling Parameters
        I0 -> Oldest Input Data Value, x(n-N+1), in Delay Line
        L0 = Filter Length (N) or Taps
        I4 -> Beginning of Filter Coefficient Table
        L4 = Filter Length (N) or Taps
        M0, M4 = 1
        CNTR = Filter Length - 1 (N - 1)

  Return Values
        MR1 = Filter Output y(n) (rounded and saturated)
        I0 -> Oldest Input Data Value in Delay Line
        I4 -> Beginning of Filter Coefficient Table

  Altered Registers
        MX0, MY0, MR

  Computation Time
        N-1+5+2 = N+6 Cycles

  All coefficients and data values are assumed to be in 1.15 format.
}

.MODULE firdirfm_sub;
.ENTRY  fir_df;

fir_df: MR=0, MX0=DM(I0,M0), MY0=PM(I4,M4);
        DO conv UNTIL CE;
conv:      MR=MR+MX0*MY0 (SS), MX0=DM(I0,M0), MY0=PM(I4,M4);
        MR=MR+MX0*MY0 (RND);
        IF MV SAT MR;
        RTS;

.ENDMOD;

Listing 6-1: Single Precision FIR Direct Form Subroutine

The limit on the number of filter taps attainable for a real-time implementation of the 
above filter routine is determined primarily by the processor cycle time, the sampling rate, and 
the number of other computations required. The FIR Direct Form routine given above requires 
a total of N + 6 cycles for a filter of length N; at an 8 KHz sampling rate and an instruction cycle 
time of 80 nanoseconds, this permits a filter of 1,400 taps with 150 instruction cycles for other 
operations. 
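
The cycle budget quoted above follows from simple arithmetic, sketched below in Python as a convenience (the function names are hypothetical). It uses only the figures given in the text: N + 6 cycles per filter output, an 8 KHz sampling rate and an 80 ns cycle time.

```python
def cycles_per_sample(sample_rate_hz, cycle_ns):
    """Whole instruction cycles available in one sampling interval."""
    return 1_000_000_000 // (sample_rate_hz * cycle_ns)


def spare_cycles(taps, sample_rate_hz=8000, cycle_ns=80):
    """Cycles left for other work after an N-tap filter costing N + 6 cycles."""
    return cycles_per_sample(sample_rate_hz, cycle_ns) - (taps + 6)


if __name__ == "__main__":
    print(cycles_per_sample(8000, 80))   # cycles in one 125 microsecond interval
    print(spare_cycles(1400))            # cycles left with a 1,400-tap filter
```

The same two lines of arithmetic give the largest filter that fits at other sampling rates or after other per-sample work has been budgeted.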

6.3.2 An Example Program 

As an application of the above subroutine, we now provide a complete example program to
implement the lowpass FIR filter designed in Example 6-1. This program is shown in Listing
6-2. The filter coefficients are given in Table 6-3 and the frequency response is shown in Figure
6-1. These coefficients are available in the disk file FIRDFLP.DAT as 1.15 hexadecimal format
numbers.

This example program (and other programs to follow) uses a subroutine CntlReg_inits to
initialize all control registers as discussed in Chapter 5. This subroutine is also available in the
disk file CREGINIT.DSP. Assemble and link the main program and two subroutines to create
an executable file. Load this executable file in the emulator and experiment with speech as well
as other signals from a signal generator.



{ Lowpass FIR Filter                                            FIRDFLP.DSP
  Using Single Precision Direct Form Structure

  This program implements a single precision Direct Form FIR filter
  structure using subroutine fir_df available in disk file FIRDIRFM.DSP.
  The filter coefficients used in this program are for a lowpass filter
  of length 81 with passband from 0 to .125 and stopband from .15 to .5.
  It is designed via Parks-McClellan algorithm. These coefficients are
  stored in a disk file, FIRDFLP.DAT.

  This program is written for EZ-ICE and EZ-LAB system with EZLAB1.ACH
  architecture file. Additionally, this program uses subroutine
  CntlReg_inits to initialize all control registers. It is available in
  disk file CREGINIT.DSP. Assemble FIRDFLP.DSP, FIRDIRFM.DSP, and
  CREGINIT.DSP using ASM21.EXE. Link using LD21.EXE to produce
  FIRDFLP.EXE. Load FIRDFLP.EXE in EZ-ICE and execute.
}

.MODULE/RAM/ABS=0/BOOT=0 FIR_DFLP;
.PORT   write_dac0;
.PORT   load_dac;
.CONST  taps=81;
.VAR/CIRC data[taps];
.VAR/PM/CIRC fir_coeff[taps];
.INIT   fir_coeff: <firdflp.dat>;

        JUMP start; RTI; RTI; RTI;      { Reset Vector }
        RTI; RTI; RTI; RTI;             { irq2 }
        RTI; RTI; RTI; RTI;             { sport0 TX }
        JUMP sample; RTI; RTI; RTI;     { sport0 RX }
        RTI; RTI; RTI; RTI;             { irq0 }
        RTI; RTI; RTI; RTI;             { irq1 }
        RTI; RTI; RTI; RTI;             { timer }

{ Initialize }

start:  CALL CntlReg_inits;             { set up SPORTS, TIMER, etc }
        I0=^data; M0=1; L0=taps;
        I4=^fir_coeff; M4=1; L4=taps;
        CNTR=taps;
        DO zero UNTIL CE;
zero:      DM(I0,M0)=0;                 { clear the filter delay line buffer }
        ICNTL=B#00111;                  { disable IRQ nesting, all IRQs edge-senstv }
        IMASK=B#101000;                 { enable IRQ2 and SPORT0_RX interrupt }
wait:   IDLE;
        JUMP wait;

{ Process Input Sample }

sample: SR1=RX0;                        { get new sample from microphone }
        DM(I0,M0)=SR1;                  { store sample in buffer }
        CNTR=TAPS-1;
        CALL FIR_DF;                    { Filter current data }
        TX0=MR1;                        { filtered output to SPORT (to spkr) }
        SR = ASHIFT MR1 BY 1 (HI);
        DM(WRITE_DAC0)=SR1;             { latch sample for D/A }
        DM(LOAD_DAC)=SR1;               { display sample on scope via D/A }
        RTI;

.ENDMOD;

Listing 6-2: Lowpass Filter Program







6.3.3 Exercises

1. Design a highpass FIR filter using the window design technique to approximate the
following specifications:

Sampling frequency: 8 KHz
Stopband: 0 - 1.6 KHz
Passband: 2 - 4 KHz
Stopband Attenuation: 50 DB
Window Function: Hamming Window

Implement this filter in the emulator and verify its operation.

2. Consider the FIR differentiator designed in Example 6-3. Create a disk file for the
filter coefficients using 1.15 hexadecimal format numbers. Implement this differentiator
in the emulator and study its effects on speech signals.



6.4 DOUBLE-PRECISION FIR DIRECT FORM FILTER 

Many digital filters require a sum-of-products computation using operands that are greater than
16 bits in magnitude. The subroutine, DFIRDIRF.DSP, described in this section implements
a sum-of-products calculation using coefficients and data that are both represented in double
precision (or 32 bits). On the ADSP-2101, this is accomplished through the use of the
mixed-mode multiply instructions. We provide this subroutine to illustrate the programming
methodology required in double-precision arithmetic.

6.4.1 A Filter Subroutine

The subroutine that realizes the double-precision sum-of-products operation used in computing
the Direct Form filter is shown in Listing 6-3. First, the sum of products of the low halves of
the coefficients and the high halves of the data values is computed; this sum is accumulated
with the sum of the products of the high halves of the coefficients and the low halves of the data
values. This sum is then shifted right 16 bits and accumulated with the sum of the products
of the high halves of the coefficients and the high halves of the data values. A conditional
saturation is then performed on the final 32-bit result before storage to data memory. Note that
because the result is only the most significant 32 bits, the products of the low-order coefficients
and the low-order data affect only the least significant bit of the result and are therefore not
computed.
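
The arithmetic behind this scheme can be checked off-line before it is coded in assembly. The following is a minimal Python sketch, not the ADSP-2101 routine itself: the function names are hypothetical, operands are assumed to be 32-bit integers in 1.31 format, and the one-bit left shift of the fractional multiplier is folded into the final scaling rather than applied to each product.

```python
def split_halves(value):
    """Split a signed 32-bit integer into (signed high half, unsigned low half)."""
    low = value & 0xFFFF
    high = (value - low) >> 16
    return high, low


def saturate32(value):
    """Clamp to the signed 32-bit range, as a conditional saturation would."""
    return max(-2**31, min(2**31 - 1, value))


def dp_sum_of_products(data, coeffs):
    """Approximate sum(data[i] * coeffs[i]) in 1.31 format from 16-bit halves."""
    cross = 0   # (high x low) + (low x high) partial products
    high = 0    # (high x high) partial products
    for x, c in zip(data, coeffs):
        xh, xl = split_halves(x)
        ch, cl = split_halves(c)
        cross += xh * cl + xl * ch
        high += xh * ch
    # The cross terms weigh 2**16 less than the high terms; the low x low
    # terms are dropped entirely. The sum is rescaled from 2.62 to 1.31.
    return saturate32(2 * high + (cross >> 15))


def exact_sum_of_products(data, coeffs):
    """Reference result using full-width integer arithmetic."""
    return saturate32(sum(x * c for x, c in zip(data, coeffs)) >> 31)
```

For N products, the result of this sketch falls short of the full-width reference by at most 2N least significant bits, which is the price of omitting the low-by-low partial products.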



{ Double Precision FIR Direct Form Filter Subroutine          DFIRDIRF.DSP

  Calling Parameters
        I0 --> Oldest input data value in delay line
        L0 = 2 * Filter length (N)
        I4 --> 2nd location (LSW of 1st value)
               of filter coefficient table
        L4 = 2 * Filter length (N)
        M0,M4 = 1
        M1,M5 = 2
        M2,M6 = 3
        AX0 = Filter length - 2 (N-2)
        CNTR = Filter length - 2 (N-2)

  Return Values
        MR1,MR0 = sum of products
                  (conditionally saturated to 32 bits)
        I0 --> Oldest input data value in delay line
        I4 --> 2nd location (LSW of 1st value)
               of filter coefficient table

  Altered Registers
        MX0,MY0,MR

  Computation Time
        3 * (N-2) + 16 + 9

  All coefficients and data values are assumed to be in 1.15 format.
}

.MODULE dfirdirfm_sub;
.ENTRY  dfir_df;

dfir_df: MR=0, MX0=DM(I0,M1), MY0=PM(I4,M5);
         DO hlloop UNTIL CE;
hlloop:    MR=MR+MX0*MY0 (SU), MX0=DM(I0,M1), MY0=PM(I4,M5);
         MR=MR+MX0*MY0 (SU), MX0=DM(I0,M2), MY0=PM(I4,M4);
         MR=MR+MX0*MY0 (SU), MX0=DM(I0,M1), MY0=PM(I4,M5);
         CNTR=AX0;
         DO lhloop UNTIL CE;
lhloop:    MR=MR+MX0*MY0 (US), MX0=DM(I0,M1), MY0=PM(I4,M5);
         MR=MR+MX0*MY0 (US), MX0=DM(I0,M0), MY0=PM(I4,M5);
         MR=MR+MX0*MY0 (US), MX0=DM(I0,M1), MY0=PM(I4,M5);
         MR0=MR1;                          {downshift 16 places}
         MR1=MR2;
         CNTR=AX0;
         DO hhloop UNTIL CE;
hhloop:    MR=MR+MX0*MY0 (SS), MX0=DM(I0,M1), MY0=PM(I4,M5);
         MR=MR+MX0*MY0 (SS), MX0=DM(I0,M1), MY0=PM(I4,M6);
         MR=MR+MX0*MY0 (SS);
         IF MV SAT MR;
         RTS;

.ENDMOD;

Listing 6-3: Double Precision FIR Direct Form Subroutine

6.4.2 Exercises 

1. The above double-precision subroutine can be easily adapted to applications requiring
mixed precision. For example, to use 32-bit coefficients and 16-bit data values, one
would eliminate the lhloop and make the corresponding changes in the data memory
pointer values and the size of the circular buffer. Modify the above subroutine to
implement this mixed-precision arithmetic. Use the lowpass filter given in
Table 6-1 to generate a filter coefficient file in 1.31 hexadecimal format. Using the
subroutine and the filter coefficient file, implement the filter and verify its operation.

2. Combine both single- and mixed-precision FIR subroutines in a program to study 
coefficient quantization errors. Implement program lines to compute and output the 
error. Determine the error at various frequencies and plot the error response. 



6.5 ALL-ZERO LATTICE FILTER 

The lattice filter is extensively used in digital speech processing and in the implementation of 
adaptive filters. It is a preferred form of realization over other FIR or IIR filter structures because 
in speech analysis and in speech synthesis, the small number of coefficients allow a large number 
of "formants" to be modelled in real time. Its physical analogue is a series of cylinders of 
different radii; each of the filter coefficients represents the amount of energy reflected at a 
boundary of two cylinders. The all-zero lattice is the FIR representation of the lattice filter 
while the all-pole lattice is the IIR representation. The all-pole lattice filters are described in 
Chapter 7. In this section, we describe a subroutine to implement the all-zero lattice filter on 
the ADSP-2101 microcomputer. 

An FIR filter of length N (or order N - 1) has a lattice structure with N - 1 stages, as
shown in Figure 6-8.

Figure 6-8: All-Zero Lattice Filter
Each stage of the filter has an input and output that are related by the order-recursive equations
([8])

$$f_m(n) = f_{m-1}(n) + k_m\, g_{m-1}(n-1), \qquad m = 1, 2, \ldots, N-1$$

$$g_m(n) = k_m\, f_{m-1}(n) + g_{m-1}(n-1), \qquad m = 1, 2, \ldots, N-1$$

where the parameters $k_m$, $m = 1, 2, \ldots, N-1$, called the reflection coefficients, are the lattice
filter coefficients. If the initial values of $f_m(n)$ and $g_m(n)$ are both the scaled value of the filter
input x(n), then the output of the (N - 1) stage lattice filter corresponds to the output of an
(N - 1) order FIR filter, that is,

$$f_0(n) = g_0(n) = k_0\, x(n)$$

$$y(n) = f_{N-1}(n)$$

For example, if N = 3 (second-order FIR filter) then the lattice filter equations are:

$$f_1(n) = k_0\, x(n) + k_0 k_1\, x(n-1)$$

$$g_1(n) = k_0 k_1\, x(n) + k_0\, x(n-1)$$

$$f_2(n) = f_1(n) + k_2\, g_1(n-1)$$

$$g_2(n) = k_2\, f_1(n) + g_1(n-1)$$

If we focus our attention on $f_2(n)$ and substitute for $f_1(n)$ and $g_1(n-1)$ from the above equations,
we obtain

$$y(n) = f_2(n) = k_0\, x(n) + k_0 k_1\, x(n-1) + k_2\left[k_0 k_1\, x(n-1) + k_0\, x(n-2)\right]$$

$$\phantom{y(n)} = k_0\, x(n) + k_0 k_1 (1 + k_2)\, x(n-1) + k_0 k_2\, x(n-2)$$

Now this equation is identical to the output of the direct form FIR filter given by

$$y(n) = h(0)\,x(n) + h(1)\,x(n-1) + h(2)\,x(n-2)$$

If we equate the coefficients, we have

$$h(0) = k_0, \qquad h(1) = k_0 k_1 (1 + k_2), \qquad h(2) = k_0 k_2$$

or, equivalently,

$$k_0 = h(0), \qquad k_1 = \frac{h(1)}{h(0) + h(2)}, \qquad k_2 = \frac{h(2)}{h(0)}$$

Thus the reflection coefficients $k_m$, m = 0,1,2 can be obtained from the impulse response h(n),
n = 0,1,2 of the FIR filter. A similar analysis can be done to demonstrate the equivalence
between an Nth-order direct form FIR filter and an N-stage lattice filter, which is given in [8].
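
The equivalence for N = 3 can be verified numerically. The sketch below is an illustration in Python with hypothetical function names, using floating-point arithmetic rather than the 1.15 format of the processor. It converts h(0), h(1), h(2) into k_0, k_1, k_2, runs the lattice recursion with f_0(n) = g_0(n) = k_0 x(n), and compares the result with the direct form output.

```python
def reflection_from_impulse(h):
    """Return [k0, k1, k2] from [h(0), h(1), h(2)] of a second-order FIR filter."""
    h0, h1, h2 = h
    return [h0, h1 / (h0 + h2), h2 / h0]


def lattice_fir(x, k):
    """All-zero lattice: k[0] scales the input, k[1:] are the stage coefficients."""
    stages = len(k) - 1
    g_delay = [0.0] * stages          # g_delay[m-1] holds g_{m-1}(n-1)
    y = []
    for xn in x:
        f = g = k[0] * xn             # f_0(n) = g_0(n) = k_0 x(n)
        for m in range(1, stages + 1):
            g_old = g_delay[m - 1]    # g_{m-1}(n-1)
            g_delay[m - 1] = g        # save g_{m-1}(n) for the next sample
            f_next = f + k[m] * g_old         # f_m(n)
            g = k[m] * f + g_old              # g_m(n)
            f = f_next
        y.append(f)
    return y


def direct_fir(x, h):
    """Direct form FIR: y(n) = sum of h(i) x(n-i)."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]
```

For example, applying a unit impulse to `lattice_fir` with the coefficients returned by `reflection_from_impulse([0.5, 0.3, 0.2])` reproduces the impulse response 0.5, 0.3, 0.2 to within rounding error.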

The ADSP-2101 implementation of the all-zero lattice filter is shown in Listing 6-4. This
subroutine computes the entire output sequence given the input sequence of finite length stored
in a buffer. Therefore this subroutine is different from the previous subroutines in which only
one output sample was computed per subroutine call. There is also another important difference
in the implementation. In the direct form structure, all the intermediate results were stored in
the MR register, which is 40 bits wide. Therefore, overflow was not an issue and only the final
result was checked for overflow. In the lattice filter implementation, the intermediate signal
values $f_m$ and $g_m$ cannot be kept in the MR register. These signal values must also be
protected from overflow since an addition is required in their calculation. However, saturation
logic may lead to severe errors since these values are used several times in the output
calculations. Therefore the desirable approach is to provide sufficient dynamic range for the
calculations. This must be determined based on the filter coefficients and the filter order.

Before this subroutine is called, several registers must be pre-loaded. The index register
I0 should contain the starting address of the input buffer, and I2 should hold the starting address
of the output buffer. The index register I3 should contain the starting address of the filter delay
line, and I4 should contain the starting address of the coefficient buffer. The length registers
L0 and L2 should be set to 0, but L3 and L4 should be set to the order of the filter (or number
of stages) to make the delay line and coefficient buffers circular. The modify register M3 should
be set to one, and the SE register should contain the value needed to maintain a valid output
data format with sufficient dynamic range. (For example, if two 1.15 numbers are multiplied,
the product is a 1.31 number. To obtain a product in 4.28 format, the SE register should be set
to -3.) The multiplier feedback register, MF, should contain the value one in the output format.
Multiplication by MF is an alternative method of converting the output to the correct format.
The CNTR register should contain the number of locations in the output buffer (or the number
of output samples to be computed).
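
The format bookkeeping in the parenthetical example can be traced with ordinary integers. The following Python sketch is an illustration only; the helper names are hypothetical and the shifter is modelled as a plain arithmetic shift. It multiplies two 1.15 numbers into a 1.31 product and then applies a shift of -3 to obtain the same value in 4.28 format.

```python
def q15(value):
    """Quantize a real number in [-1, 1) to a 1.15 integer."""
    return int(round(value * 2**15))


def fractional_multiply(a, b):
    """1.15 x 1.15 -> 1.31; the left shift removes the duplicate sign bit."""
    return (a * b) << 1


def reformat(product, se):
    """Arithmetic shift by se places; a negative se shifts right."""
    return product << se if se >= 0 else product >> -se


p_1_31 = fractional_multiply(q15(0.5), q15(0.25))   # 0.125 in 1.31 format
p_4_28 = reformat(p_1_31, -3)                       # 0.125 in 4.28 format
```

Both `p_1_31 / 2**31` and `p_4_28 / 2**28` evaluate to 0.125, but the 4.28 word leaves three extra integer bits of headroom for the additions in each lattice stage.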

The out_lp loop is executed once for each output data point. The CNTR register is loaded
with the order of the filter, and the first input data point is loaded into MX0. The latt_lp loop
performs the filtering operation on the input data point. The first multiplication in the latt_lp
loop formats the $f_{m-1}(n)$ value into the MR register and also reads in values for $g_{m-1}(n-1)$ and
$k_m$. These values are then multiplied and accumulated to produce $f_m(n)$; at the same time, the
value $g_{m-1}(n)$ is stored in the delay line for the next pass. The value $f_m(n)$ is reformatted in the
shifter for use by the multiplier in the next pass of the latt_lp loop. Next, $g_{m-1}(n-1)$ is formatted
into the multiplier to compute the value of $g_m(n)$. This value is then accumulated with the
product of $k_m$ and $f_{m-1}(n)$. Again, the shifter reformats the value before storage.

{ All-Zero Lattice Filter Subroutine                          FIRLATFL.DSP

  Calling Parameters
        CNTR = Length of Output Buffer
        I0 --> Input Buffer                         L0 = 0
        I2 --> Output Buffer                        L2 = 0
        I3 --> Delay Line Buffer (circular)         L3 = Filter Order
        I4 --> Coefficient Buffer (circular)        L4 = Filter Order
        M0 = 1
        M2 = 0
        M3 = 1
        M4 = 1
        SE = Appropriate Scale Factor
        MF = Formatted 1

  Return Values
        Output Buffer Filled

  Altered Registers
        MX0,MX1,MY0,MF,MR,SR,I2,I3,I4

  All coefficients and data values are assumed to be in 1.15 format.

  Computation Time
        (8 * Filter Order + 4) * Output Buffer Length + 3 + 1 cycles
}

.MODULE firlatfl_sub;
.ENTRY  fir_lf;

fir_lf: SR1=0;                                              {Clear SR1 for first pass}
        DO out_lp UNTIL CE;                                 {Loop output length}
          CNTR=L3;
          MX0=DM(I0,M0);                                    {Get x(n)}
          DO latt_lp UNTIL CE;                              {Loop through filter}
            MR=MX0*MF (SS), MX1=DM(I3,M2), MY0=PM(I4,M4);   {Get g, k}
            MR=MR+MX1*MY0 (SS), DM(I3,M3)=SR1;              {Compute fm, store g}
            SR=ASHIFT MR1 (HI);                             {Reformat fm}
            SR=SR OR LSHIFT MR0 (LO);
            MR=MX1*MF (SS);                                 {Format gm-1}
            MX0=SR1, MR=MR+MX0*MY0 (SS);                    {Compute gm and hold fm}
            SR=ASHIFT MR1 (HI);                             {Reformat gm}
latt_lp:    SR=SR OR LSHIFT MR0 (LO);
out_lp:   DM(I2,M0)=MX0;                                    {Save output}
        RTS;

.ENDMOD;

Listing 6-4: All-Zero Lattice Filter Subroutine



6.6 SINGLE SIDEBAND (SSB) MODULATOR 

As an interesting project on FIR filtering, let us consider Single Sideband (SSB) modulation.
This is a well known modulation technique in analog communication systems. We will simulate
this technique in the ADSP-2101 system for speech signals. The project incorporates many
modules that we studied earlier, including a sinusoidal waveform generator, a Hilbert
transformer, and a delay line. Therefore in this project, we will write a program to implement
a complete system and analyze its performance.

SSB modulation was developed to address two problems in Amplitude Modulation (AM),
namely wasteful transmitted power and transmission bandwidth. Since the upper and lower
sidebands of AM are uniquely related by symmetry about the carrier frequency, given the
amplitude and phase of one we can always reconstruct the other. Suppressing the carrier and
one sideband overcomes the shortcomings in AM. This leads to SSB modulation. There are
several approaches to achieving SSB modulation. One approach which is appropriate for our
purpose involves the use of a Hilbert transformer, which is an all-pass filter that imparts a 90°
phase shift on the signal at its input. For an arbitrary signal x(t), it can be shown that [10] the
SSB modulated signal $x_c(t)$ is given by

$$x_c(t) = \tfrac{1}{2}\left[x(t)\cos(2\pi F_c t) \mp \hat{x}(t)\sin(2\pi F_c t)\right] \qquad (6\text{-}5)$$

where the upper sign is taken for upper SSB and vice versa, $F_c$ is the carrier frequency, and
$\hat{x}(t)$ is the Hilbert Transform of x(t). A block diagram of the SSB modulator based on
the above equation is shown in Figure 6-9. We will now see how to implement this analog
modulator using the ADSP-2101 DSP system.



Figure 6-9: Single Sideband Modulator

6.6.1 A Frequency Shifter

When the upper sign is used in equation (6-5), we obtain the upper sideband of the signal shifted
in frequency by $F_c$. In this mode, the SSB modulator functions as a frequency shifter.
Listing 6-5 shows a program which shifts a speech signal by 2 KHz. It is available on the disk
as the FIRDFSSB.DSP file.

The frequency shifter implemented in this program uses the 51-tap FIR Hilbert transformer
designed in Example 6-4. The filter is implemented using subroutine fir_df from Section 6.3.1
and it uses two circular buffers: data and fir_coeff. Due to its linear-phase characteristics, this
filter imparts a 25-sample shift on the input signal. Therefore the input signal in the upper
branch of Figure 6-9 must also be delayed by the same amount for proper synchronization. This
is done by using a circular buffer called sig_delay. We need two digital oscillators, one for the
cosine waveform and the other for the sine waveform. This can be achieved by using one
oscillator along with a delay line as a phase shifter. A digital oscillator is implemented using
the sin subroutine developed in Chapter 5. The circular buffer cos_sin performs the function
of two oscillators. In this project, the sampling frequency is 8 KHz while the carrier frequency
is 2 KHz. Therefore a delay of two samples provides a phase-shift of 90°. This can be achieved
by the cos_sin buffer of length 3. Finally, a circular buffer angles stores 8 angle values to
generate a 2 KHz waveform. The rest of the program is self explanatory.
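
Before running the assembly program, the signal flow of Figure 6-9 can be simulated off-line. The sketch below is a floating-point Python illustration, not a translation of Listing 6-5. It assumes a Hamming-windowed ideal Hilbert transformer in place of the Example 6-4 coefficients, computes the carrier directly with `math.cos` and `math.sin` instead of the oscillator and delay-line arrangement, and uses hypothetical function names throughout.

```python
import math

FS = 8000.0                 # sampling frequency in Hz
FC = 2000.0                 # carrier frequency in Hz
TAPS = 51                   # length of the Hilbert transformer
ALPHA = (TAPS - 1) // 2     # 25-sample delay of the linear-phase filter


def hilbert_taps(n_taps):
    """Hamming-windowed ideal Hilbert transformer (an assumed design)."""
    mid = (n_taps - 1) // 2
    taps = []
    for n in range(n_taps):
        k = n - mid
        ideal = 0.0 if k % 2 == 0 else 2.0 / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_taps - 1))
        taps.append(ideal * window)
    return taps


def ssb_upper(x):
    """Upper-sideband output of equation (6-5) for the input sequence x."""
    h = hilbert_taps(TAPS)
    y = []
    for n in range(len(x)):
        # Quadrature branch: Hilbert-transformed input (delayed by ALPHA).
        quad = sum(h[i] * x[n - i] for i in range(TAPS) if n - i >= 0)
        # In-phase branch: input delayed by ALPHA to stay synchronized.
        inphase = x[n - ALPHA] if n >= ALPHA else 0.0
        w = 2.0 * math.pi * FC * n / FS
        y.append(0.5 * (inphase * math.cos(w) - quad * math.sin(w)))
    return y


def tone_level(y, freq):
    """Magnitude of the correlation of y with a complex tone at freq."""
    re = sum(v * math.cos(2.0 * math.pi * freq * n / FS) for n, v in enumerate(y))
    im = sum(v * math.sin(2.0 * math.pi * freq * n / FS) for n, v in enumerate(y))
    return math.hypot(re, im) / len(y)
```

Feeding a 1 KHz cosine into `ssb_upper` and measuring `tone_level` at 3 KHz and at 1 KHz (after discarding the start-up transient) shows nearly all of the output energy in the 3 KHz component, the same check that is made on the scope with the emulator.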



{ Single Sideband Modulator
  Using Single Precision Direct Form Structure                FIRDFSSB.DSP

  This program implements a Single Sideband (SSB) Modulator using Hilbert
  Transform Approach. The Hilbert Transformer is implemented using a
  51-tap single precision FIR filter. The coefficients of this filter
  are available in FIRDFHT.DAT disk file. The program also uses
  subroutine sin to generate cosine and sine components of carrier
  frequency at 2 KHz. This subroutine is available in disk file
  SINE.DSP.

  This program is written for EZ-ICE and EZ-LAB system with EZLAB1.ACH
  architecture file. Additionally, the program uses subroutine
  CntlReg_inits to initialize all control registers. This subroutine is
  available in disk file CREGINIT.DSP. Assemble FIRDFSSB.DSP, SIN.DSP,
  FIRDIRFM.DSP and CREGINIT.DSP using ASM21.EXE. Link using LD21.EXE to
  produce FIRDFSSB.EXE. Load FIRDFSSB.EXE in EZ-ICE and execute.
}

.MODULE/RAM/ABS=0/BOOT=0        FIR_SSBM;

.CONST  taps=51;
.CONST  alpha=25;
.VAR/DM/CIRC    data[taps];
.VAR/PM/CIRC    fir_coeff[taps];
.VAR/DM/CIRC    sig_delay[alpha];
.VAR/PM/CIRC    angles[8];
.VAR/DM/CIRC    cos_sin[3];
.INIT   fir_coeff: <firdfht.dat>;
.INIT   angles: 0x000000, 0x200000, 0x400000, 0x600000,
                0x800000, 0xA00000, 0xC00000, 0xE00000;

        JUMP start; RTI; RTI; RTI;        {Reset Vector}
        RTI; RTI; RTI; RTI;               {Irq2}
        RTI; RTI; RTI; RTI;               {Sport0 TX}
        JUMP sample; RTI; RTI; RTI;       {Sport0 RX}
        RTI; RTI; RTI; RTI;               {Irq0}
        RTI; RTI; RTI; RTI;               {Irq1}
        RTI; RTI; RTI; RTI;               {Timer}

{ Initialize }

start:  CALL CntlReg_inits;               {set up SPORTS, TIMER, etc}
        I0=^data;       M0=1;   L0=taps;
        I1=^sig_delay;  M1=0;   L1=alpha;
        I2=^cos_sin;            L2=2;
                        M3=1;   L3=0;
        I4=^fir_coeff;  M4=1;   L4=taps;
        I5=^angles;     M5=0;   L5=8;
        CNTR=taps;
        DO zero1 UNTIL CE;
zero1:    DM(I0,M0)=0;                    {Clear the filter delay line buffer}
        CNTR=alpha;
        DO zero2 UNTIL CE;
zero2:    DM(I1,M0)=0;                    {Clear the signal delay line buffer}
        ICNTL=b#00111;                    {disable IRQ nesting, all IRQs edge-sensitive}
        IMASK=b#101000;                   {enable IRQ2 and SPORT0_RX interrupt}
wait:   IDLE;
        JUMP wait;

{ Process Input Sample }

sample: SR1=RX0;                          {Get new sample from microphone}
        DM(I0,M0)=SR1;                    {Store sample in data buffer}
        MY0=DM(I1,M1);                    {In phase component of signal in MY0}
        DM(I1,M0)=SR1;                    {Store sample in sig_delay buffer}
        AX0=DM(I5,M4);                    {Angle from angles buffer in AX0}
        CALL sin;                         {Cosine value in AR}
        DM(I2,M0)=AR;                     {Store cosine value in cos_sin buffer}
        MR=AR*MY0 (SS);                   {product of cos and in-phase in MR1}
        SR=ASHIFT MR1 BY -1 (HI);         {scale by 0.5}
        AX0=SR1;                          {scaled product in AX0}
        CNTR=taps-1;
        CALL fir_df;                      {Quadrature component in MR1}
        MY0=DM(I2,M1);                    {Sin value in MY0}
        MR=MR1*MY0 (SS);                  {product of sin and quad in MR1}
        SR=ASHIFT MR1 BY -1 (HI);         {Scale by 0.5}
        AY0=SR1;                          {Move to AY0}
        AR=AX0-AY0;                       {Upper SSB modulated signal in AR}
        IF AV SAT AR;
        TX0=AR;                           {modulated signal to SPORT (to spkr)}
        RTI;

.ENDMOD;

Listing 6-5: A Frequency Shifter Program



Assemble, link, and execute this program in the emulator. Be careful to use proper 
architecture and filter coefficient files. Connect a function generator to the microphone port 
and apply a tone of 1 KHz. Observe the output on a scope and verify that it is a 3 KHz tone. 
Experiment with speech signals and characterize their output from the speaker. 

6.6.2 Exercises 

1. The above SSB modulator shifts the input signal, which has bandwidth up to 3.4 KHz
(due to the anti-aliasing filter in the codec), by 2 KHz. The frequency shifted output
signal therefore will be distorted due to aliasing above 4 KHz (at an 8 KHz sampling
rate). We should have implemented a lowpass filter to restrict frequency components
below 2 KHz. Modify the above program to first include a lowpass filter with cutoff
frequency of 1 KHz. Choose the proper specifications for the lowpass FIR filter.
Also modify the program so that the upper SSB is available at DAC0 and the lower
SSB is available at DAC1. Experiment with speech signals and listen to the outputs
from both SSBs. Is the output from the lower SSB intelligible? If not, why not?

2. A Single Sideband modulator can also be implemented by first multiplying the input
signals by a carrier (which is AM) and then extracting the upper or lower SSBs using
a sharp highpass or lowpass filter, respectively. Write a program to simulate a
frequency shifter based on this idea. Prefilter the input signals up to 1 KHz. Use a
carrier frequency of 2 KHz and a stopband for FIR filters beginning at 2 KHz.
Compare the performance of this shifter with the one above.



6.7 SUMMARY

Design and implementation of FIR digital filters is a very important operation in DSP. In this
chapter, we first briefly discussed two useful design techniques. The window design is a simple
technique while the Parks-McClellan algorithm is an optimum one. Using simple programs
developed in Chapter 5, we then discussed implementation of FIR filters using single- and
double-precision arithmetic. We also introduced another FIR filter structure called the all-zero
lattice filter and described its implementation. Finally, as an exercise in a complete system
design and implementation, we developed an SSB modulator as a frequency shifter. In the next
chapter, we will provide a similar treatment for IIR digital filters.



7.1 INTRODUCTION

In this chapter, Infinite-duration Impulse Response (IIR) filters are designed and implemented
on the ADSP-2101 microcomputer. Compared to FIR filters, IIR filters can often be much more
efficient in terms of attaining better magnitude response with a given filter order. This is because
IIR filters incorporate feedback and are capable of realizing both poles and zeros of a system
function, whereas FIR filters are only capable of realizing the zeros. This means that IIR filters
can run faster and hence must be considered in applications where speed is important. They,
however, have stability problems and have nonlinear phase characteristics which might make
them unsuitable for some applications. FIR filters, on the other hand, are always stable and can
be designed to have exact linear phase.

When it comes to implementation, there are more trade-offs that must be considered when
choosing between FIR and IIR filters. As discussed in Chapter 6, FIR filters are generally
implemented using a linear convolution. This implementation does not pose any problems on
finite word-length processors, such as the ADSP-2101, because all filter coefficients can be
suitably scaled to avoid saturation and overflow. IIR filters are implemented using difference
equations as described in Chapter 5. One approach is to compute two convolution sums as two
FIR filters over the past few inputs and outputs and take the difference to obtain the next output
value. This approach is sensitive to coefficient quantization, and if the order is
large the filter can actually become unstable. Therefore in most cases, IIR filters are
factored into second-order sections to minimize this sensitivity, and then the entire filter is
implemented as a cascade or parallel network of these sections. Still, each factor must be carefully
implemented to avoid overall numerical overflow problems. There are other problems which
stem from the fact that this filter is recursive. Even if the input is zero, the filter can enter
a mode in which it produces a nonzero output, called a limit cycle. These and other implementation
issues will be discussed in this chapter.

In the first half of the chapter, we briefly discuss the realization and design issues of IIR 
filters. We then present computer programs for three important structures on the ADSP-2101 
microcomputer with emphasis on trouble-free implementations. 
7.2 IIR FILTER STRUCTURES

Infinite-duration Impulse Response filters form a wider class of filters whose z-domain system
function H(z) is a rational function in z. This class includes FIR filters as a special case.
However, we treated FIR filters separately for design as well as for implementation purposes.
The system function of an IIR filter is given by

$$H(z) = \frac{Y(z)}{X(z)} = \frac{b_0 + b_1 z^{-1} + \cdots + b_N z^{-N}}{1 - a_1 z^{-1} - \cdots - a_N z^{-N}} \qquad (7\text{-}1)$$

where $b_n$ and $a_n$ are the coefficients of the filter. We have assumed that both the numerator and
denominator polynomials are of order N. The order of such an IIR filter is N if $a_N \neq 0$.
The difference equation representation of an IIR filter was discussed earlier in Chapter 5. It is
expressed as:

$$y(n) = \sum_{m=0}^{N} b_m\, x(n-m) + \sum_{m=1}^{N} a_m\, y(n-m) \qquad (7\text{-}2)$$

This representation implies that the output of an IIR filter is a function of the past outputs and
the present and past inputs. This feedback (or recursive) nature of the IIR filter is a major
challenge in its implementation, especially on a digital signal processor, as we shall see.

There are three different structures that can be used to implement an IIR filter.

• Direct Form: In this form, the difference equation (7-2) is implemented directly as
given. Since there are two parts to this filter, namely FIR and recursive (or equivalently,
the numerator and denominator parts), this implementation leads to two versions, Direct
Form I and Direct Form II structures.

• Cascade Form: In this form, the system function H(z) in equation (7-1) is factored
into smaller second-order sections, called biquads. The system function is then
represented as a product of these biquads. Each biquad is implemented in a direct form and
the entire system function is implemented as a cascade of biquad sections.

• Parallel Form: This is similar to the cascade form but after factorization, a partial
fraction expansion is used to represent H(z) as a sum of smaller second-order sections.
Each section is again implemented in a direct form and the entire system function is
implemented as a parallel network of sections.

We will briefly discuss the direct and cascade forms in this section. The parallel form, due
to its design, makes sense when more processors are available to implement all sections
simultaneously. In this book, we will not discuss multiprocessor implementation. Therefore,
the parallel form is of little use in our IIR implementations.



7.2.1 Direct Form

As the name suggests, the difference equation (7-2) is implemented as given using memory,
multipliers and adders. For the purpose of illustration, let N = 4. Then the difference equation
is

$$y(n) = b_0 x(n) + b_1 x(n-1) + b_2 x(n-2) + b_3 x(n-3) + b_4 x(n-4)$$
$$\qquad\quad + a_1 y(n-1) + a_2 y(n-2) + a_3 y(n-3) + a_4 y(n-4)$$

which can be implemented as shown in Figure 7-1. This block diagram is called the Direct Form
I structure.



Figure 7-1: Direct Form I Structure

The direct form I structure implements each part of the rational function H(z) separately 
with a cascade connection between them. The numerator, or FIR part, is a tapped delay line 
followed by the denominator, or recursive part, which is a feedback tapped delay line. Thus, 
there are two separate delay lines in this structure and hence it requires 8 memory elements. 
We can reduce this memory count or eliminate one delay line by interchanging the order in 
which the two parts are connected in the cascade. Now the two delay lines are close to each 
other, connected by a unity gain branch. Therefore one delay line can be removed and this 
reduction leads to a canonical structure called Direct Form II structure. It is shown in Figure
7-2. It should be noted that both direct forms are equivalent from the input-output point of view. 
Internally, however, they have different signals. 
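
The input-output equivalence of the two direct forms can be confirmed with a short simulation. The following Python sketch is an illustration with hypothetical function names and floating-point arithmetic; it follows the sign convention of equation (7-2), stores the feedback coefficients in `a[1]` to `a[N]` (with `a[0]` unused), and assumes `b` and `a` have the same length N + 1.

```python
def direct_form_1(x, b, a):
    """Equation (7-2) as given: separate delay lines for inputs and outputs."""
    y = []
    for n in range(len(x)):
        acc = sum(b[m] * x[n - m] for m in range(len(b)) if n - m >= 0)
        acc += sum(a[m] * y[n - m] for m in range(1, len(a)) if n - m >= 0)
        y.append(acc)
    return y


def direct_form_2(x, b, a):
    """Recursive part first, then FIR part, sharing one delay line w(n)."""
    order = len(a) - 1
    w = [0.0] * (order + 1)          # w[m] holds w(n-m)
    y = []
    for xn in x:
        w[0] = xn + sum(a[m] * w[m] for m in range(1, order + 1))
        y.append(sum(b[m] * w[m] for m in range(order + 1)))
        w = [0.0] + w[:-1]           # shift the delay line by one sample
    return y
```

For N = 4 the first function keeps eight past values (four inputs and four outputs) while the second keeps only the four past values of w(n), yet both return the same output sequence for the same input.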
Figure 7-2: Direct Form II Structure



In Section 7.4, we will mainly discuss the implementation of the direct form II structure
on the ADSP-2101 because it is more efficient in terms of memory usage and speed. In Chapter
5, we briefly discussed direct form I implementation under the difference equation
implementation topic.

7.2.2 Cascade Form 



In this form, the system function H(z) is written as a product of second order sections. In the
remainder of this chapter, we assume that N is an even number. Then

$$H(z) = \frac{b_0 + b_1 z^{-1} + \cdots + b_N z^{-N}}{1 - a_1 z^{-1} - \cdots - a_N z^{-N}}
= b_0\, \frac{1 + \frac{b_1}{b_0} z^{-1} + \cdots + \frac{b_N}{b_0} z^{-N}}{1 - a_1 z^{-1} - \cdots - a_N z^{-N}}
= b_0 \prod_{k=1}^{M} \frac{1 + B_{1k} z^{-1} + B_{2k} z^{-2}}{1 - A_{1k} z^{-1} - A_{2k} z^{-2}} \qquad (7\text{-}3)$$

where M is equal to N/2 and $B_{1k}$, $B_{2k}$, $A_{1k}$, and $A_{2k}$ are real numbers which are the coefficients of
second order sections. The second order section

$$H_k(z) = \frac{Y_{k+1}(z)}{Y_k(z)} = \frac{1 + B_{1k} z^{-1} + B_{2k} z^{-2}}{1 - A_{1k} z^{-1} - A_{2k} z^{-2}}, \qquad k = 1, \ldots, M$$

with

$$Y_1(z) = b_0 X(z), \qquad Y_{M+1}(z) = Y(z)$$

is called the kth biquad section. The input to the kth biquad section is the output from the (k - 1)th
biquad section while the output from the kth biquad is the input to the (k + 1)th biquad. Now
each biquad section $H_k(z)$ can be implemented in direct form II as shown in Figure 7-3. The
entire filter is then implemented as a cascade of biquads.
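
The factorization in equation (7-3) can be checked numerically by running a signal through a cascade of biquads and through the multiplied-out direct form. The Python sketch below is an illustration only; the function names and the coefficient values in the usage lines are hypothetical, and each section is given as a tuple (B1, B2, A1, A2) in the sign convention of equation (7-3).

```python
def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def biquad_df2(x, B1, B2, A1, A2):
    """One biquad in direct form II with a two-element delay line."""
    w1 = w2 = 0.0
    y = []
    for xn in x:
        w0 = xn + A1 * w1 + A2 * w2
        y.append(w0 + B1 * w1 + B2 * w2)
        w2, w1 = w1, w0
    return y


def cascade(x, b0, sections):
    """Scale by b0, then pass the signal through each biquad in turn."""
    y = [b0 * xn for xn in x]
    for B1, B2, A1, A2 in sections:
        y = biquad_df2(y, B1, B2, A1, A2)
    return y


def expand(b0, sections):
    """Multiply the biquads out into direct form coefficients b[m], a[m]."""
    num, den = [b0], [1.0]
    for B1, B2, A1, A2 in sections:
        num = poly_mul(num, [1.0, B1, B2])
        den = poly_mul(den, [1.0, -A1, -A2])
    return num, [0.0] + [-d for d in den[1:]]


def direct_form(x, b, a):
    """Equation (7-2): y(n) = sum b[m] x(n-m) + sum a[m] y(n-m)."""
    y = []
    for n in range(len(x)):
        acc = sum(b[m] * x[n - m] for m in range(len(b)) if n - m >= 0)
        acc += sum(a[m] * y[n - m] for m in range(1, len(a)) if n - m >= 0)
        y.append(acc)
    return y


sections = [(0.5, 0.25, 0.3, -0.2), (-0.4, 0.1, -0.1, -0.5)]   # N = 4, M = 2
b, a = expand(0.8, sections)
impulse = [1.0] + [0.0] * 29
y_cascade = cascade(impulse, 0.8, sections)
y_direct = direct_form(impulse, b, a)
```

The two impulse responses agree to within floating-point rounding. On a fixed-point processor the two structures would differ in their sensitivity to coefficient quantization, which this floating-point sketch does not model.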



Figure 7-3: Biquad Section Structure

As an example, consider N = 4. Figure 7-4 shows a cascade form structure for this 4th
order IIR filter.

Figure 7-4: Cascade Form Structure for N = 4

In Section 7.5, we will discuss the implementation of the cascade form structure on the
ADSP-2101. From a finite word-length point of view, this implementation has advantages over
the direct form as we shall see. But first, we briefly discuss the design of IIR filters in the next
section.

7.2.3 Lattice Structure

In Section 6.5, we described a lattice filter structure that is equivalent to an FIR filter. In this
section we present a similar structure for all-pole IIR filters. The implementation of IIR filters
that contain both poles and zeros requires a lattice-ladder structure, which is not described here.

Consider an all-pole filter described by the system function

$$H(z) = \frac{b_0}{1 + \sum_{m=1}^{N} a_m z^{-m}}$$

The difference equation for this filter is

$$y(n) = b_0\, x(n) - \sum_{m=1}^{N} a_m\, y(n-m) \qquad (7\text{-}4)$$

If we interchange x(n) with y(n) in equation (7-4), we obtain

$$x(n) = b_0\, y(n) - \sum_{m=1}^{N} a_m\, x(n-m)$$

or, equivalently,

$$y(n) = \frac{1}{b_0}\, x(n) + \frac{1}{b_0} \sum_{m=1}^{N} a_m\, x(n-m)$$

We note that the above equation describes an all-zero (FIR) filter for which we have
presented a lattice structure in Section 6.5. The lattice structure for the all-zero filter can now
be used to obtain a lattice structure for an all-pole IIR filter by interchanging the roles of the
input and output. We take the all-zero lattice filter illustrated in Figure 6-8 and redefine the
input as

$$x(n) = f_N(n)$$

and the output as

$$y(n) = b_0\, f_0(n)$$

These are exactly the opposite of the definitions for the all-zero lattice. The quantities $\{f_m(n)\}$
likewise are also computed in opposite order and the resulting set of equations is given by ([8])

$$f_N(n) = x(n) \qquad (7\text{-}5a)$$

$$f_{m-1}(n) = f_m(n) - k_m\, g_{m-1}(n-1), \qquad m = N, N-1, \ldots, 1 \qquad (7\text{-}5b)$$

$$g_m(n) = k_m\, f_{m-1}(n) + g_{m-1}(n-1), \qquad m = N, N-1, \ldots, 1 \qquad (7\text{-}5c)$$

$$y(n) = b_0\, f_0(n) = b_0\, g_0(n) \qquad (7\text{-}5d)$$

which corresponds to the structure shown in Figure 7-5.
Figure 7-5: Lattice Filter Structure



To demonstrate that the set of equations (7-5) represents the all-pole IIR filter given in
equation (7-4), consider the case in which N = 2. The equations above reduce to

$$f_2(n) = x(n)$$

$$f_1(n) = f_2(n) - k_2\, g_1(n-1)$$

$$f_0(n) = f_1(n) - k_1\, g_0(n-1)$$

$$g_1(n) = k_1\, f_0(n) + g_0(n-1)$$

$$y(n) = b_0\, f_0(n) = b_0\, g_0(n)$$

After some simple substitution and manipulations (using $f_0(n) = g_0(n) = y(n)/b_0$) we obtain

$$y(n) = b_0\, x(n) - k_1 (1 + k_2)\, y(n-1) - k_2\, y(n-2) \qquad (7\text{-}6)$$

Comparing the equation (7-4) with (7-6), we note that the two representations are equivalent if

$$a_1 = k_1 (1 + k_2), \qquad a_2 = k_2$$

A similar analysis can be done to establish the equivalence between an Nth-order direct form
all-pole IIR filter and an N-stage all-pole lattice filter ([8]).

In Section 7.6, we will discuss the implementation of the all-pole lattice filter on the
ADSP-2101.
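
The N = 2 equivalence can also be checked by simulation. The Python sketch below is a floating-point illustration with hypothetical function names: it runs the recursion $f_{m-1}(n) = f_m(n) - k_m g_{m-1}(n-1)$, $g_m(n) = k_m f_{m-1}(n) + g_{m-1}(n-1)$ with $f_N(n) = x(n)$ and $y(n) = b_0 f_0(n)$, and compares the output with the recursion $y(n) = b_0 x(n) - a_1 y(n-1) - a_2 y(n-2)$.

```python
def allpole_lattice(x, k, b0=1.0):
    """All-pole lattice; k[m-1] holds the reflection coefficient k_m."""
    order = len(k)
    g_delay = [0.0] * order               # g_delay[m] holds g_m(n-1)
    y = []
    for xn in x:
        f = xn                            # f_N(n) = x(n)
        g_next = [0.0] * order
        for m in range(order, 0, -1):
            f = f - k[m - 1] * g_delay[m - 1]                  # f_{m-1}(n)
            if m < order:
                g_next[m] = k[m - 1] * f + g_delay[m - 1]      # g_m(n)
        g_next[0] = f                     # g_0(n) = f_0(n)
        g_delay = g_next
        y.append(b0 * f)
    return y


def allpole_direct(x, a, b0=1.0):
    """y(n) = b0 x(n) - sum of a[m-1] y(n-m), as in equation (7-4)."""
    y = []
    for n in range(len(x)):
        acc = b0 * x[n]
        acc -= sum(a[m - 1] * y[n - m] for m in range(1, len(a) + 1) if n - m >= 0)
        y.append(acc)
    return y


k1, k2, b0 = 0.5, 0.3, 2.0
a = [k1 * (1.0 + k2), k2]                 # a_1 and a_2 for N = 2
impulse = [1.0] + [0.0] * 19
y_lattice = allpole_lattice(impulse, [k1, k2], b0)
y_direct = allpole_direct(impulse, a, b0)
```

The two impulse responses agree to within rounding error, which is a quick way to validate a set of reflection coefficients before they are loaded into the coefficient buffer.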

7.3 OVERVIEW OF IIR FILTER DESIGN 

IIR filters have long (infinite-duration) impulse responses. Hence they can be matched to analog 
filters, all of which generally have infinitely long impulse responses. Therefoie the basic IIR 
filter design technique transforms well known analog filters into digital filters using 
complex-valued mappings. The advantage of this technique lies in the fact that both analog 
filter design tables and complex-valued mappings are extensively available in literature. This 
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basic technique is called Filter Transformation. However the analog filter design tables are 
available for lowpass filters only. One would also like to design and implement other frequency 
selective filters such as highpass, bandpass, or bandstop filters. To obtain these other filters, 
one needs to apply Spectral Transformations. These transformations are complex-valued 
mappings that are also available in literature. 

There are two approaches to this basic design technique in which a digital filter is obtained
from an analog lowpass filter. In the first approach, an analog spectral transformation is applied
to the lowpass filter to obtain another frequency-selective filter. Then a filter transformation is
applied to obtain the desired digital filter. In the second approach, a lowpass digital filter
is first obtained from the corresponding analog filter through a filter transformation. Then the desired
frequency-selective digital filter is obtained using a frequency-band transformation. In this book we
will explain the second approach, while most software packages for filter design use the first
approach.

Given the frequency domain filter specifications, the essence of IIR filter design is to
obtain the filter order N and the filter coefficients $a_k$, $b_k$ (for direct form structures) or $A_{jk}$,
$B_{jk}$ (for the cascade form structure). Therefore the following steps are required in this design:

• Using the given digital specifications, obtain the corresponding lowpass analog filter 
specifications. This step requires inverse filter and spectral transformations on critical 
band or cutoff frequencies. 

• Design an analog lowpass filter. This step requires use of well known analog filter 
design tables or formulas. 

• Apply a filter transformation to obtain a digital lowpass filter. Two useful
transformations are available, impulse invariance and bilinear, of which the latter
is the better choice.

• Apply a spectral transformation to obtain the desired digital filter from the lowpass
filter. The result of this step is the system function $H(z)$ in the form of equation (7-1).

• Finally, if the cascade form structure is desired, factor $H(z)$ to obtain biquad sections
in the form of equation (7-3).

The main problem with this technique is that we have no control over the phase response 
which ideally should be linear in the passband for most applications. This is because analog 
filter design equations are based on magnitude-squared frequency response. For the precise 
linear-phase response, one must consider FIR filters discussed in Chapter 6. 

7.3.1 Analog Lowpass Filter Design

The IIR filter design techniques that we are discussing rely on existing analog lowpass filters
(LPF) to obtain the desired digital filters. We will call these analog filters prototype filters.
Three prototypes are widely used in practice: the Butterworth LPF, the
Chebyshev LPF, and the Elliptic LPF. Butterworth filters have a flat response in both passband
and stopband, while Chebyshev filters have either an equiripple passband and flat stopband (Type-I)
or a flat passband and equiripple stopband (Type-II). In this section, we will briefly summarize
design equations to obtain system functions of Butterworth and Chebyshev Type-I filters given
the frequency domain specifications. Elliptic filters exhibit equiripple behavior in both the
passband and stopband, and hence they are optimum in that they achieve the minimum order
N for the given specifications. However, they are very difficult to analyze. It is not possible
to design them using simple tools, and computer programs are often needed to design them.
Therefore we will not discuss them in this book since the lower order of the elliptic filter is not
an important issue in our implementations.

Analog filters are specified in terms of their frequency-domain transfer function $H_a(j\Omega)$,
where the subscript a denotes an analog quantity and $\Omega$ is the analog frequency. Analog prototype
filters are characterized in terms of their magnitude-squared response given by

$$|H_a(j\Omega)|^2 = H_a(j\Omega)\,H_a^{*}(j\Omega) = H_a(s)\,H_a(-s)\big|_{s=j\Omega}$$

where $H_a(s)$ is the s-domain system function. Thus,

$$H_a(s)\,H_a(-s) = |H_a(j\Omega)|^2\big|_{\Omega=-js}$$ (7-7)

Therefore the poles and zeros of the magnitude-squared system function are distributed with
mirror-image symmetry with respect to the $j\Omega$ axis in the s-plane. A causal and stable $H_a(s)$
can now be obtained from this distribution by assigning the left-half plane poles and zeros of
$H_a(s)H_a(-s)$ to $H_a(s)$. The resulting filter is called a minimum-phase filter.

The magnitude-squared specifications on prototype filters are given in terms of the parameters
shown in Figure 7-6. These parameters are:

• passband cutoff frequency $\Omega_P$ in rad/sec,

• stopband cutoff frequency $\Omega_S$ in rad/sec,

• passband ripple $R_P$ in dB, and

• stopband attenuation $A_S$ in dB.

Given these parameters, we would like to determine $H_a(s)$ in order to obtain the prototype filter.

Figure 7-6: Analog Prototype Filter Specifications (vertical axis: $-10\log_{10}|H_a(j\Omega)|^2$; horizontal axis: $\Omega$ in rad/sec)

Butterworth Filter Design

This filter is characterized by the property that its magnitude response is flat in both passband
and stopband. The magnitude-squared response of an Nth-order lowpass filter is given by

$$|H_a(j\Omega)|^2 = \frac{1}{1 + (\Omega/\Omega_c)^{2N}}$$ (7-8)

where $\Omega_c$ is the 3-dB cutoff frequency. The system function $H_a(s)$ can be obtained from equation
(7-8) by substituting $\Omega = -js$, as in equation (7-7), to obtain

$$H_a(s)\,H_a(-s) = \frac{1}{1 + \left(\dfrac{s}{j\Omega_c}\right)^{2N}} = \frac{(j\Omega_c)^{2N}}{s^{2N} + (j\Omega_c)^{2N}}$$

The roots of the denominator polynomial, or the poles of $H_a(s)H_a(-s)$, are given by

$$s_k = \Omega_c \exp\left[\,j\frac{\pi}{2N}(2k + N + 1)\right], \qquad k = 0, 1, \ldots, 2N-1$$ (7-9)

Now a stable and causal filter $H_a(s)$ can be obtained by selecting the poles in the left-half plane,
resulting in the system function in the form:

$$H_a(s) = \frac{\Omega_c^{N}}{\prod\limits_{\text{LHP poles}} (s - s_k)}$$ (7-10)

Design Equations

Given $\Omega_P$, $\Omega_S$, $R_P$, and $A_S$, two parameters are required to specify the Butterworth filter:
the order N and the 3-dB cutoff frequency $\Omega_c$. We want

$$-10\log_{10}\left[\frac{1}{1 + (\Omega_P/\Omega_c)^{2N}}\right] = R_P$$

and

$$-10\log_{10}\left[\frac{1}{1 + (\Omega_S/\Omega_c)^{2N}}\right] = A_S$$

Solving these two equations for N and $\Omega_c$ yields

$$N = \left\lceil \frac{\log\left[(10^{R_P/10} - 1)/(10^{A_S/10} - 1)\right]}{2\log(\Omega_P/\Omega_S)} \right\rceil$$ (7-11)

where $\lceil\cdot\rceil$ denotes rounding up to the next integer, and

$$\Omega_c = \frac{\Omega_P}{(10^{R_P/10} - 1)^{1/(2N)}}$$ (7-12a)

or

$$\Omega_c = \frac{\Omega_S}{(10^{A_S/10} - 1)^{1/(2N)}}$$ (7-12b)

The first value in the above equation satisfies the specifications at $\Omega_P$ exactly while the second
value satisfies the specifications at $\Omega_S$ exactly. Now using equations (7-9), (7-10), (7-11),
and (7-12), we can determine the system function $H_a(s)$ of a Butterworth filter.
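
The Butterworth design equations can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `butterworth_design` is hypothetical, the pole formula assumes the standard form $s_k = \Omega_c \exp[j\pi(2k+N+1)/(2N)]$, and the stopband-exact cutoff of equation (7-12b) is chosen.

```python
import cmath
import math


def butterworth_design(omega_p, omega_s, rp_db, as_db):
    """Return (N, omega_c, left-half-plane poles) of an analog Butterworth prototype."""
    num = math.log10((10 ** (rp_db / 10) - 1) / (10 ** (as_db / 10) - 1))
    den = 2 * math.log10(omega_p / omega_s)
    n = math.ceil(num / den)                                        # order, eq. (7-11)
    omega_c = omega_s / (10 ** (as_db / 10) - 1) ** (1 / (2 * n))   # cutoff, eq. (7-12b)
    # k = 0..N-1 selects the N poles of eq. (7-9) that lie in the left-half plane
    poles = [omega_c * cmath.exp(1j * math.pi * (2 * k + n + 1) / (2 * n))
             for k in range(n)]
    return n, omega_c, poles
```

For the pre-warped specifications used later in Example 7-1 ($\Omega_P = 2\tan(0.125\pi)$, $\Omega_S = 2\tan(0.175\pi)$, 1 dB ripple, 50 dB attenuation), this sketch should reproduce N = 17 and a cutoff close to 0.87356.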

Chebyshev Filter Design

There are two types of Chebyshev filters: one has equiripple behavior in the passband (Type-I)
and the other has equiripple behavior in the stopband (Type-II). These two types are closely
related and the design equations are essentially the same for both types. Therefore we will
consider only the Type-I filter, called Chebyshev-I. The magnitude-squared response of an
Nth-order Chebyshev-I filter is given by

$$|H_a(j\Omega)|^2 = \frac{1}{1 + \epsilon^2\, T_N^2(\Omega/\Omega_c)}$$ (7-13)

where $\epsilon$ is a passband ripple factor related to $R_P$, and $T_N(\cdot)$ is the Nth-order Chebyshev polynomial
given by

$$T_N\left(\frac{\Omega}{\Omega_c}\right) = \begin{cases} \cos\left[N\cos^{-1}(\Omega/\Omega_c)\right], & 0 \le \Omega \le \Omega_c \\ \cosh\left[N\cosh^{-1}(\Omega/\Omega_c)\right], & \Omega_c < \Omega < \infty \end{cases}$$

The equiripple behavior of the Chebyshev filters is due to this polynomial.

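A causal and stable $H_a(s)$ can be obtained by determining the poles of $H_a(s)H_a(-s)$ and
selecting the left-half plane poles for $H_a(s)$. The poles are obtained by finding the roots of

$$1 + \epsilon^2\, T_N^2\left(\frac{s}{j\Omega_c}\right) = 0$$

It can be shown [8] that the poles fall on an ellipse in the s-plane with the minor axis $\alpha\Omega_c$ along the
real axis and the major axis $\beta\Omega_c$ along the imaginary axis, where

$$\alpha = \tfrac{1}{2}\left(\gamma^{1/N} - \gamma^{-1/N}\right)$$ (7-14a)

$$\beta = \tfrac{1}{2}\left(\gamma^{1/N} + \gamma^{-1/N}\right)$$ (7-14b)

$$\gamma = \frac{1}{\epsilon} + \sqrt{1 + \frac{1}{\epsilon^2}}$$ (7-14c)

If $s_k = \sigma_k + j\Omega_k$ represents a pole of $H_a(s)$, then

$$\sigma_k = (\alpha\Omega_c)\cos\left[\frac{\pi}{2} + \frac{(2k+1)\pi}{2N}\right], \qquad k = 0, 1, \ldots, N-1$$ (7-15a)

$$\Omega_k = (\beta\Omega_c)\sin\left[\frac{\pi}{2} + \frac{(2k+1)\pi}{2N}\right], \qquad k = 0, 1, \ldots, N-1$$ (7-15b)

Now the system function is given by

$$H_a(s) = \frac{K}{\prod\limits_{\text{LHP poles}} (s - s_k)}$$ (7-16)

where K is a normalizing factor chosen to make

$$H_a(j0) = \begin{cases} 1, & N \text{ odd} \\ \dfrac{1}{\sqrt{1+\epsilon^2}}, & N \text{ even} \end{cases}$$ (7-17)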

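Design Equations

Given $\Omega_P$, $\Omega_S$, $R_P$, and $A_S$, three parameters are required to specify a Chebyshev-I filter:
the order N, the cutoff frequency $\Omega_c$, and the ripple factor $\epsilon$. Equations for these
parameters are given without proof.

$$\epsilon = \sqrt{10^{R_P/10} - 1}$$ (7-18a)

$$\Omega_c = \Omega_P \quad \text{and} \quad \Omega_r = \frac{\Omega_S}{\Omega_P}$$ (7-18b)

$$A = 10^{A_S/20}$$ (7-18c)

$$g = \sqrt{(A^2 - 1)/\epsilon^2}$$ (7-18d)

$$N = \left\lceil \frac{\log\left[g + \sqrt{g^2 - 1}\right]}{\log\left[\Omega_r + \sqrt{\Omega_r^2 - 1}\right]} \right\rceil$$ (7-18e)

Now using equations (7-14), (7-15), (7-16), (7-17) and (7-18), the system function $H_a(s)$ can
be determined.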
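
The Chebyshev-I design equations can be collected into one routine. This is a hedged sketch rather than the book's program: the function name `chebyshev1_design` is hypothetical, the pole equations are assumed in the standard ellipse form ($\gamma = 1/\epsilon + \sqrt{1 + 1/\epsilon^2}$, with $\alpha$ and $\beta$ built from $\gamma^{\pm 1/N}$), and K is obtained from the DC condition $H_a(j0) = K / \prod(-s_k)$.

```python
import math


def chebyshev1_design(omega_p, omega_s, rp_db, as_db):
    """Return (N, epsilon, left-half-plane poles, K) of an analog Chebyshev-I prototype."""
    eps = math.sqrt(10 ** (rp_db / 10) - 1)                 # ripple factor, eq. (7-18a)
    omega_c = omega_p                                        # eq. (7-18b)
    omega_r = omega_s / omega_p
    a_lin = 10 ** (as_db / 20)                               # eq. (7-18c)
    g = math.sqrt((a_lin ** 2 - 1) / eps ** 2)               # eq. (7-18d)
    n = math.ceil(math.log10(g + math.sqrt(g * g - 1)) /
                  math.log10(omega_r + math.sqrt(omega_r ** 2 - 1)))  # eq. (7-18e)
    gamma = 1 / eps + math.sqrt(1 + 1 / eps ** 2)
    alpha = 0.5 * (gamma ** (1 / n) - gamma ** (-1 / n))     # minor-axis factor
    beta = 0.5 * (gamma ** (1 / n) + gamma ** (-1 / n))      # major-axis factor
    poles = []
    for k in range(n):
        theta = math.pi / 2 + (2 * k + 1) * math.pi / (2 * n)
        poles.append(complex(alpha * omega_c * math.cos(theta),
                             beta * omega_c * math.sin(theta)))
    # K makes H_a(j0) equal 1 (N odd) or 1/sqrt(1 + eps^2) (N even)
    dc_gain = 1.0 if n % 2 else 1 / math.sqrt(1 + eps ** 2)
    prod = complex(1.0)
    for s in poles:
        prod *= -s
    return n, eps, poles, dc_gain * prod.real
```

With the pre-warped specifications of Example 7-2, this sketch should give N = 8 and a ripple factor near 0.50885.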
7.3.2 Filter Transformations

The next step in IIR filter design is to transform the analog filters into digital filters. There are 
several complex-valued transformations that are available in the literature. The most popular 
among these are the Impulse Invariant Transformation and the Bilinear Transformation. The 
impulse invariant mapping preserves the impulse response of the analog filter by specifying the 
digital filter's impulse response to be a sampled version of the continuous-time response. As 
a result, the frequency response of the digital filter is an aliased version of the frequency response 
of the analog filter. The bilinear transformation is a one-to-one mapping which eliminates the 
aliasing problem by translating the analog frequency response to a digital system function with 
a comparable frequency response. Therefore in this section, we will briefly discuss the bilinear 
transformation. 



It is defined by the following mapping from s to z:

$$s = \frac{2}{T}\,\frac{1 - z^{-1}}{1 + z^{-1}}$$ (7-19)

where T is a parameter. The inverse mapping is given by

$$z = \frac{1 + sT/2}{1 - sT/2}$$ (7-20)

To relate the analog and digital frequency responses, let $z = e^{j\omega}$ and $s = j\Omega$ in equation (7-20). Then

$$e^{j\omega} = \frac{1 + j\Omega T/2}{1 - j\Omega T/2}$$

Solving for $\omega$ as a function of $\Omega$, we obtain

$$\omega = 2\tan^{-1}\left(\frac{\Omega T}{2}\right)$$

The above equation gives a nonlinear compression effect between the analog frequency
$\Omega$ and the digital frequency $\omega$. This is called warping of the analog frequencies, which also
means that there is no aliasing. The entire imaginary axis in the s-plane is mapped uniquely
onto the unit circle in the z-plane. This warping, however, is not a problem because digital
frequencies can be pre-warped to obtain analog frequencies prior to designing analog filters.
This pre-warping is given by

$$\Omega = \frac{2}{T}\tan\left(\frac{\omega}{2}\right)$$ (7-21)

Design Procedure

The design procedure to obtain an IIR lowpass digital filter can be summarized in the
following steps. Given the design specifications:

• Passband cutoff frequency $\omega_P$ with ripple $R_P$

• Stopband cutoff frequency $\omega_S$ with attenuation $A_S$

1. Choose the parameter T and pre-warp the passband and stopband cutoff frequencies
using equation (7-21) to obtain the corresponding $\Omega_P$ and $\Omega_S$. This parameter T can
be arbitrary and we will set it to 1.

2. Design an analog prototype filter $H_a(s)$ using the design equations in Section 7.3.1 for
either the Butterworth or Chebyshev-I filter.

3. Set

$$H(z) = H_a\left(\frac{2}{T}\,\frac{1 - z^{-1}}{1 + z^{-1}}\right)$$

and simplify to obtain a rational $H(z)$ system function. This representation gives the
direct form structures.

4. If desired, factor $H(z)$ into a product of biquad sections for the cascade form structure.
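
Steps 1 and 3 of this procedure can be illustrated for an all-pole prototype $H_a(s) = K / \prod (s - s_k)$. The sketch below is an assumption-laden illustration, not the book's software: the helper names are hypothetical, and the returned denominator is the polynomial $1 + a_1 z^{-1} + \cdots + a_N z^{-N}$ in ascending powers of $z^{-1}$. It uses the fact that each factor $s - s_k$ becomes $(c - s_k)(1 - z_k z^{-1})/(1 + z^{-1})$ with $c = 2/T$ and $z_k = (c + s_k)/(c - s_k)$.

```python
import math


def prewarp(omega, T=1.0):
    """Pre-warp a digital frequency (rad/sample) to an analog frequency (rad/sec)."""
    return (2.0 / T) * math.tan(omega / 2.0)


def poly_from_roots(roots):
    """Coefficients of prod(1 - r*z^-1), ascending powers of z^-1."""
    coeffs = [complex(1.0)]
    for r in roots:
        nxt = coeffs + [complex(0.0)]
        for i, c in enumerate(coeffs):
            nxt[i + 1] -= r * c
        coeffs = nxt
    return coeffs


def bilinear_allpole(poles, K, T=1.0):
    """Map H_a(s) = K / prod(s - s_k) to H(z) = gain * (1 + z^-1)^N / A(z)."""
    c = 2.0 / T
    n = len(poles)
    z_poles = [(c + s) / (c - s) for s in poles]       # digital pole locations
    gain = complex(K)
    for s in poles:
        gain /= (c - s)
    a = [coef.real for coef in poly_from_roots(z_poles)]   # denominator A(z)
    b = [float(math.comb(n, k)) for k in range(n + 1)]     # binomial numerator
    return gain.real, b, a
```

As a quick check, the first-order prototype $H_a(s) = 1/(s+1)$ with T = 1 maps to $H(z) = \tfrac{1}{3}(1 + z^{-1})/(1 - \tfrac{1}{3}z^{-1})$, which has unity gain at z = 1 as expected.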



7.3.3 Spectral Transformations 

In the last two sections, we described design techniques for digital lowpass filters. Certainly
we would like to design other types of frequency-selective filters such as highpass, bandpass,
bandstop, or even other lowpass filters. This is accomplished by transforming the frequency
band of a lowpass filter so that it behaves like another frequency-selective filter. These
transformations on the complex variable z are very similar to the bilinear transformation and the
corresponding design equations are algebraic.

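Let $H_{LP}(Z)$ be the designed lowpass digital filter and let $H(z)$ be the desired digital filter.
Note that we are using two frequency variables, Z and z, with $H_{LP}$ and $H$ respectively. A
mapping of the form

$$Z^{-1} = G(z^{-1})$$

transforms

$$H_{LP}(Z)\big|_{Z^{-1} = G(z^{-1})} \rightarrow H(z)$$

if it is a valid mapping with proper parameters. The general form of the function $G(\cdot)$ is of an
allpass filter type given by

$$G(z^{-1}) = \pm \prod_{k=1}^{n} \frac{z^{-1} - \alpha_k}{1 - \alpha_k z^{-1}}$$

where $|\alpha_k| < 1$ for stability. By appropriately choosing n and the corresponding $\alpha_k$'s, we can
obtain a variety of spectral transformations. The most widely used transformations are shown
in Table 7-1, in which $Z = e^{j\omega'}$, $z = e^{j\omega}$, and $\omega'_c$ is the cutoff frequency of $H_{LP}(Z)$.

Type of Transformation: Lowpass
    Transformation:  $Z^{-1} \rightarrow \dfrac{z^{-1} - \alpha}{1 - \alpha z^{-1}}$
    Parameters:      $\omega_c$ = cutoff frequency of new filter
                     $\alpha = \dfrac{\sin[(\omega'_c - \omega_c)/2]}{\sin[(\omega'_c + \omega_c)/2]}$

Type of Transformation: Highpass
    Transformation:  $Z^{-1} \rightarrow -\dfrac{z^{-1} + \alpha}{1 + \alpha z^{-1}}$
    Parameters:      $\omega_c$ = cutoff frequency of new filter
                     $\alpha = -\dfrac{\cos[(\omega'_c + \omega_c)/2]}{\cos[(\omega'_c - \omega_c)/2]}$

Type of Transformation: Bandpass
    Transformation:  $Z^{-1} \rightarrow -\dfrac{z^{-2} - a_1 z^{-1} + a_2}{a_2 z^{-2} - a_1 z^{-1} + 1}$
    Parameters:      $\omega_l$ = lower cutoff frequency
                     $\omega_u$ = upper cutoff frequency
                     $a_1 = 2\alpha K/(K+1)$
                     $a_2 = (K-1)/(K+1)$
                     $\alpha = \dfrac{\cos[(\omega_u + \omega_l)/2]}{\cos[(\omega_u - \omega_l)/2]}$
                     $K = \cot\dfrac{\omega_u - \omega_l}{2}\,\tan\dfrac{\omega'_c}{2}$

Type of Transformation: Bandstop
    Transformation:  $Z^{-1} \rightarrow \dfrac{z^{-2} - a_1 z^{-1} + a_2}{a_2 z^{-2} - a_1 z^{-1} + 1}$
    Parameters:      $\omega_l$ = lower cutoff frequency
                     $\omega_u$ = upper cutoff frequency
                     $a_1 = 2\alpha/(K+1)$
                     $a_2 = (1-K)/(1+K)$
                     $\alpha = \dfrac{\cos[(\omega_u + \omega_l)/2]}{\cos[(\omega_u - \omega_l)/2]}$
                     $K = \tan\dfrac{\omega_u - \omega_l}{2}\,\tan\dfrac{\omega'_c}{2}$

Table 7-1: Spectral Transformations for Digital Filters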
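
The lowpass and highpass rows of Table 7-1 can be checked numerically by evaluating the mapping on the unit circle. The sketch below is illustrative only: the function names are hypothetical, and it assumes the standard forms $Z^{-1} = (z^{-1} - \alpha)/(1 - \alpha z^{-1})$ for lowpass and $Z^{-1} = -(z^{-1} + \alpha)/(1 + \alpha z^{-1})$ for highpass.

```python
import cmath
import math


def lp_to_lp_alpha(wc_proto, wc_new):
    """Alpha for the lowpass-to-lowpass transformation."""
    return math.sin((wc_proto - wc_new) / 2) / math.sin((wc_proto + wc_new) / 2)


def lp_to_hp_alpha(wc_proto, wc_new):
    """Alpha for the lowpass-to-highpass transformation."""
    return -math.cos((wc_proto + wc_new) / 2) / math.cos((wc_proto - wc_new) / 2)


def map_lowpass(w, alpha):
    """Prototype frequency reached from frequency w of the new lowpass filter."""
    zi = cmath.exp(-1j * w)                    # z^-1 on the unit circle
    big_zi = (zi - alpha) / (1 - alpha * zi)   # Z^-1 = G(z^-1)
    return -cmath.phase(big_zi)


def map_highpass(w, alpha):
    """Prototype frequency reached from frequency w of the new highpass filter."""
    zi = cmath.exp(-1j * w)
    big_zi = -(zi + alpha) / (1 + alpha * zi)
    return -cmath.phase(big_zi)
```

In both cases the new cutoff $\omega_c$ should land on the prototype cutoff $\omega'_c$; for the highpass mapping it lands on $-\omega'_c$, which is equivalent because the magnitude response is even in frequency.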
7.3.4 Examples

In the following examples, we will design the lowpass filter specified in Example 6-1 using both
Butterworth and Chebyshev-I prototypes. This will allow us to compare FIR and IIR filter
designs in terms of their orders and frequency responses. We will also examine differences
between the two prototype designs.

Example 7-1:

In the first example, we will design the lowpass filter specified in Example 6-1. The
design specifications are:

Sampling frequency:     8 kHz
Passband:               0 - 1 kHz
Stopband:               1.4 - 4 kHz
Passband Ripple:        1 dB
Stopband Attenuation:   50 dB

We will use a Butterworth filter as the analog prototype and the bilinear transformation
method for digital filter design. The digital cutoff frequencies are:

$$\omega_P = 2\pi\left(\frac{1}{8}\right) = 0.25\pi, \qquad \omega_S = 2\pi\left(\frac{1.4}{8}\right) = 0.35\pi$$

Following the design procedure outlined in Section 7.3.2, the design steps are:

1. Pre-warp the analog frequencies: Using equation (7-21) with T = 1 we obtain

$$\Omega_P = 2\tan\left(\frac{\omega_P}{2}\right) = 0.82843, \qquad \Omega_S = 2\tan\left(\frac{\omega_S}{2}\right) = 1.22560$$

2. Analog filter design (Butterworth Prototype): Using the design equations from
Section 7.3.1 for the Butterworth filter, we obtain

• from equation (7-11), N = 17, and

• from equation (7-12b), $\Omega_c$ = 0.8735558, which satisfies the stopband
specifications exactly but exceeds those in the passband.

The poles of $H_a(s)$ can now be computed from equation (7-9). These poles are
shown in Table 7-2.



-8.06016E-02 ± j 8.69829E-01
-2.39060E-01 ± j 8.40208E-01
-3.89377E-01 ± j 7.81975E-01
-5.26435E-01 ± j 6.97113E-01
-6.45566E-01 ± j 5.88511E-01
-7.42712E-01 ± j 4.59868E-01
-8.14566E-01 ± j 3.15565E-01
-8.58682E-01 ± j 1.60515E-01
-8.73556E-01 + j 0.00000E+00

Table 7-2: Poles of Analog Prototype Filter in Example 7-1

• The gain in equation (7-10) is $\Omega_c^N$ = 1.00448E-01.

This completes the analog filter design.

3. Bilinear transformation: By setting

$$H(z) = H_a\left(2\,\frac{1 - z^{-1}}{1 + z^{-1}}\right)$$

we obtain a rational $H(z)$ function whose numerator and denominator coefficients
are shown in Table 7-3.



Filter Order: 17          Gain Factor: 7.41033E-09

Numerator Coefficients           Denominator Coefficients
(Highest Order (in z) First)     (Highest Order (in z) First)

 1.000000E+00                     1.000000E+00
 1.700000E+01                    -8.219959E+00
 1.360000E+02                     3.325991E+01
 6.800000E+02                    -8.7215?0E+01
 2.380000E+03                     1.650530E+02
 6.188000E+03                    -2.382394E+02
 1.237600E+04                     2.707250E+02
 1.944700E+04                    -2.467582E+02
 2.430900E+04                     1.822617E+02
 2.430900E+04                    -1.095289E+02
 1.944700E+04                     5.347435E+01
 1.237600E+04                    -2.106390E+01
 6.188000E+03                     6.605379E+00
 2.380000E+03                    -1.612977E+00
 6.800000E+02                     0.295961E+00
 1.360000E+02                    -3.846800E-02
 1.700000E+01                     3.145218E-03
 1.000000E+00                    -1.373291E-03

Table 7-3: Direct Form System Function in Example 7-1

4. Cascade form structure: Factoring $H(z)$ into a product of second-order sections we
obtain the biquad coefficients shown in Table 7-4.





Filter Order: 17          Gain Factor: 7.41033E-09

  B_1k       B_2k        A_1k         A_2k
 2.0000     1.0000    -1.272990    0.8732049
 2.0000     1.0000    -1.131914    0.6656117
 2.0000     1.0000    -1.024236    0.5071641
 2.0000     1.0000    -0.942488    0.3868720
 2.0000     1.0000    -0.881346    0.2969001
 2.0000     1.0000    -0.837061    0.2317379
 2.0000     1.0000    -0.807070    0.1876033
 2.0000     1.0000    -0.789700    0.1620010
 0.0000     1.0000     1.000000   -0.392002

Table 7-4: Cascade Form System Function in Example 7-1

This completes the specified filter design in which the filter coefficients are obtained
for both the direct and cascade form realizations. Magnitude and log-magnitude plots of
this filter are shown in Figures 7-7 and 7-8 respectively. Figure 7-9 shows the phase
response while Figure 7-10 depicts the pole-zero pattern.



Figure 7-7: Butterworth Lowpass Filter: Magnitude Response (horizontal axis: frequency in kHz)

Figure 7-8: Butterworth Lowpass Filter: Log-Magnitude Response (horizontal axis: frequency in kHz)

Figure 7-9: Butterworth Lowpass Filter: Phase Response (horizontal axis: frequency in kHz)

Figure 7-10: Butterworth Lowpass Filter: Pole-Zero Pattern



Example 7-2:

In this example we will design the same lowpass digital filter as in Example 7-1 by using
a Chebyshev-I prototype. The design specifications are given in Example 7-1. We will
also use the bilinear transformation. The digital cutoff frequencies are the same as before:

$$\omega_P = 0.25\pi, \qquad \omega_S = 0.35\pi$$

The design steps are:

1. Pre-warp the analog frequencies:

$$\Omega_P = 2\tan\left(\frac{\omega_P}{2}\right) = 0.82843, \qquad \Omega_S = 2\tan\left(\frac{\omega_S}{2}\right) = 1.22560$$

2. Analog filter design (Chebyshev-I Prototype): Using the design equations from
Section 7.3.1 for the Chebyshev-I filter, we obtain

• from equation (7-18a), $\epsilon$ = 0.50885,

• from equation (7-18b), $\Omega_c$ = 0.82843 and $\Omega_r$ = 1.47942,

• from equation (7-18c), $A = 10^{A_S/20}$ = 316.22776,

• from equation (7-18d), $g = \sqrt{(A^2 - 1)/\epsilon^2}$ = 621.45265, and

• from equation (7-18e), N = 8.

The poles of $H_a(s)$ can now be computed from equations (7-14) and (7-15). These
poles are shown in Table 7-5.

-2.90018E-02 ± j 8.25487E-01
-8.25901E-02 ± j 6.99814E-01
-1.23605E-01 ± j 4.67601E-01
-1.45802E-01 ± j 1.64200E-01

Table 7-5: Poles of Analog Prototype Filter in Example 7-2

• The gain K in equation (7-16) is obtained from equation (7-17) as

$$K = \frac{1}{\sqrt{1+\epsilon^2}} \prod_{\text{LHP poles}} |s_k| = 0.003406$$

This completes the analog filter design.

3. Bilinear transformation: Setting

$$H(z) = H_a\left(2\,\frac{1 - z^{-1}}{1 + z^{-1}}\right)$$

we obtain a rational $H(z)$ function whose numerator and denominator coefficients
are shown in Table 7-6.

4. Cascade form structure: Factoring $H(z)$ into a product of second-order sections we
obtain the biquad coefficients shown in Table 7-7.
Filter Order: 8          Gain Factor: 6.71517E-06

Numerator Coefficients           Denominator Coefficients
(Highest Order (in z) First)     (Highest Order (in z) First)

 1.000000E+00                     1.000000E+00
 8.000000E+00                    -6.065304E+00
 2.800000E+01                     1.719893E+01
 5.600000E+01                    -2.953864E+01
 7.000000E+01                     3.345596E+01
 5.600000E+01                    -2.552769E+01
 2.800000E+01                     1.230249E+01
 8.000000E+00                    -3.858937E+00
 1.000000E+00                     0.535854E+00

Table 7-6: Direct Form System Function in Example 7-2



Filter Order: 8          Gain Factor: 6.715176E-06

  B_1k       B_2k       A_1k       A_2k
 2.0000     1.0000    -1.3829     0.9516
 2.0000     1.0000    -1.4516     0.8631
 2.0000     1.0000    -1.5930     0.7909
 2.0000     1.0000    -1.7065     0.7482

Table 7-7: Cascade Form System Function in Example 7-2

This completes the specified filter design in which the filter coefficients are obtained
for both the direct and cascade form realizations. Magnitude and log-magnitude plots of
this filter are shown in Figures 7-11 and 7-12 respectively. Figure 7-13 shows the phase
response while Figure 7-14 depicts the pole-zero pattern.

Figure 7-11: Chebyshev-I Lowpass Filter: Magnitude Response (horizontal axis: frequency in kHz)

Figure 7-14: Chebyshev-I Lowpass Filter: Pole-Zero Pattern

Comparing these examples with those in Chapter 6, we make several observations. IIR
filters generally have a much lower order and a sharper transition band than FIR filters, which makes
them attractive from an implementation point of view. However, IIR filters exhibit a nonlinear
phase response. IIR filter designs also vary markedly with the analog prototype used. For a
given set of specifications, Butterworth filters have a higher order than Chebyshev filters, while
elliptic filters are the optimum in this respect. From the phase plots, it is clear that the Chebyshev
filter has a more nonlinear phase response than the Butterworth filter, while the elliptic filter has the
most nonlinear phase response.

Even though it is possible to design other frequency selective filters using the methods 
outlined in this chapter, several excellent computer programs are currently available for use on 
personal computers to design such filters. Therefore, we will now focus on implementing 
designed filters on the ADSP-2101 microcomputer. 



7.4 DIRECT FORM IIR FILTER 

As discussed in Section 7.2, realization of an IIR filter can take many forms. The most useful
of those in the ADSP-2101 implementation are the Direct Form and Cascade Form structures.
In this section, we describe a subroutine for the single-precision direct form structure. This
form is easier to implement than the cascade form because the scaling mechanism to avoid
overflow and increase dynamic range is simpler to understand. Similarly, the effect of limit-cycle
oscillations is well understood. In the cascade form, each biquad section is implemented using
the direct form. Therefore it is necessary to study the direct form subroutine.

Of the two possible implementations of the direct form structure, the Direct Form II is
the most widely used since it contains only one delay line. We begin with the description
of a Direct Form II subroutine followed by an example program. The emphasis in the example
will be on understanding the scaling operation.

7.4.1 A Direct Form II Subroutine

The Direct Form II structure is described by equation (7-1) (or (7-2)) and is shown in Figure 7-2.
It has two parts: a feedforward (or FIR-like) part and a feedback (or recursive) part. It can best
be analyzed using an intermediate signal w(n) shown in Figure 7-2. In terms of this signal,
equation (7-1) can be written as

$$H(z) = \frac{Y(z)}{W(z)} \cdot \frac{W(z)}{X(z)}$$

where

$$\frac{Y(z)}{W(z)} = b_0 + b_1 z^{-1} + \cdots + b_N z^{-N}$$

is the feedforward section, and

$$\frac{W(z)}{X(z)} = \frac{1}{1 - a_1 z^{-1} - \cdots - a_N z^{-N}}$$

is the feedback section. The corresponding difference equations are

$$y(n) = b_0 w(n) + b_1 w(n-1) + \cdots + b_N w(n-N)$$ (7-22)

$$w(n) = x(n) + a_1 w(n-1) + \cdots + a_N w(n-N)$$ (7-23)

Therefore the only signal that needs to be stored is w(n), which requires a single circular buffer.
The subroutine that implements these two equations is shown in Listing 7-1. Before the
execution of the subroutine, the input value x(n) is in register MR1 and the signal w(n) values
along the delay line are shifted by one sample in the circular buffer. In the subroutine, the
sum-of-products of the a values and the delay line values is computed first according to equation
(7-23). This replaces the previous value of w(n) by the current update. Next, the sum-of-products
of the b values and the delay line values, with w(n) updated, is computed according to equation
(7-22) and the result is stored in MR1 which, at the conclusion of the subroutine, provides the
output value y(n). Finally, the circular buffer index register I0 is modified so that the delay
line current location is w(n-1).

The above subroutine requires a total of 2N + 11 cycles for a filter of order N. At an 8 kHz
sampling rate and an instruction cycle time of 80 nanoseconds, this permits a filter of length
700 with about 150 instruction cycles left for other operations.
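
The two difference equations and the cycle budget can be sketched in floating-point Python before moving to the fixed-point subroutine. This is only an illustrative model with hypothetical function names: it assumes the sign convention $w(n) = x(n) + \sum a_k w(n-k)$, shifts a plain list instead of using a circular buffer, and ignores 1.15 quantization entirely.

```python
import math


def direct_form_2(x, b, a):
    """Direct Form II filter; b = [b0..bN], a = [a1..aN] (feedback terms are added)."""
    n_order = len(a)
    w = [0.0] * (n_order + 1)            # w[k] holds w(n-k)
    y = []
    for xn in x:
        w = [0.0] + w[:-1]               # shift the delay line by one sample
        # feedback sum-of-products gives the new w(n)
        w[0] = xn + sum(a[k - 1] * w[k] for k in range(1, n_order + 1))
        # feedforward sum-of-products gives y(n)
        y.append(sum(b[k] * w[k] for k in range(n_order + 1)))
    return y


def spare_cycles(order, fs_hz=8000.0, t_cycle_s=80e-9):
    """Instruction cycles left per sample after the 2N + 11 cycle subroutine."""
    budget = math.floor(1.0 / (fs_hz * t_cycle_s))   # whole cycles per sample period
    return budget - (2 * order + 11)
```

For example, `direct_form_2([1, 0, 0], [1.0, 1.0], [0.5])` returns the first three impulse response samples of $(1 + z^{-1})/(1 - 0.5z^{-1})$, and `spare_cycles(700)` reproduces the roughly 150 spare cycles quoted above.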
{ IIR Direct Form II Filter Subroutine                     IIRDFMII.DSP

  Calling Parameters
        MR1 = Input sample ( x[n] )
        MR0 = 0
        I0 --> Delay line buffer current location ( w[n-1] )
        L0 = Filter length
        I5 --> Feedback coefficients (a's)
        L5 = Filter length - 1
        I6 --> Feedforward coefficients (b's)
        L6 = Filter length
        M0 = 0
        M1,M4 = 1
        M2 = 2
        CNTR = Filter length - 2
        AX0 = Filter length - 1

  Return Values
        MR1 = output sample ( y[n] )
        I0 --> delay line current location ( w[n-1] )
        I5 --> feedback coefficients
        I6 --> feedforward coefficients

  Altered Registers
        MX0,MY0,MR

  Computation Time
        (N - 2) + (N - 1) + 10 + 4 cycles (N = M = Filter order)

  All coefficients and data values are assumed to be in 1.15 format.
}

.MODULE iirdfmii_sub;
.ENTRY  iir_dfii;

iir_dfii:   MX0=DM(I0,M1), MY0=PM(I5,M4);
            DO poleloop UNTIL CE;
poleloop:       MR=MR+MX0*MY0(SS), MX0=DM(I0,M1), MY0=PM(I5,M4);
            MR=MR+MX0*MY0(RND);
            IF MV SAT MR;
            CNTR=AX0;
            DM(I0,M0)=MR1;                      { Update w(n) }
            MR=0, MX0=DM(I0,M1), MY0=PM(I6,M4);
            DO zeroloop UNTIL CE;
zeroloop:       MR=MR+MX0*MY0(SS), MX0=DM(I0,M1), MY0=PM(I6,M4);
            MR=MR+MX0*MY0(RND);
            MODIFY(I0,M2);
            RTS;

.ENDMOD;

Listing 7-1: IIR Direct Form II Subroutine

7.4.2 An Example Program 

As an example of the above subroutine, we provide a complete example program to implement
the lowpass filter designed in Example 7-1. This program is shown in Listing 7-2. The filter
coefficients are given in Table 7-3 and the magnitude response is shown in Figure 7-7.

The most important issue in IIR filter implementation is the proper scaling of the
coefficients to avoid overflow at the MAC output after every sum-of-products calculation. On the
other hand, excessive scaling to avoid overflow may result in inefficient use of the available
dynamic range. Therefore the filter coefficients must be carefully scaled. This problem is not
an important one in the case of FIR filters because these filters are generally designed to have
{ Lowpass IIR Filter
  Using Direct Form II Structure                           IIRDF2LP.DSP

  This program implements a single precision Direct Form II IIR filter
  structure using subroutine iir_dfii available in disk file IIRDFMII.DSP.
  The filter coefficients used in this program are for the Lowpass filter
  described in Example 7-1. These coefficients are stored in disk files:
  IIRDFLPA.DAT and IIRDFLPB.DAT.

  This program is written for EZ-ICE and EZ-LAB system with EZLAB1.ACH
  architecture file. Additionally, this program uses subroutine
  CntlReg_inits to initialize all control registers. It is available in
  disk file CREGINIT.DSP. Assemble IIRDF2LP.DSP, IIRDFMII.DSP, and
  CREGINIT.DSP using ASM21.EXE. Link using LD21.EXE to produce
  IIRDF2LP.EXE. Load IIRDF2LP.EXE in EZ-ICE and execute.
}

.MODULE/RAM/ABS=0/BOOT=0    IIR_DF2LP;

.PORT           write_dac0;
.PORT           load_dac;
.CONST          length=18;                  { N = 17 }
.VAR/CIRC       data[length];
.VAR/PM/CIRC    a_coeffs[length];
.VAR/PM/CIRC    b_coeffs[length];
.VAR/DM/CIRC    delay_line[length];
.INIT           a_coeffs: <IIRDFLPA.DAT>;
.INIT           b_coeffs: <IIRDFLPB.DAT>;

        JUMP start;  RTI; RTI; RTI;         { Reset Vector }
        RTI; RTI; RTI; RTI;                 { irq2 }
        RTI; RTI; RTI; RTI;                 { sport0 TX }
        JUMP sample; RTI; RTI; RTI;         { sport0 RX }
        RTI; RTI; RTI; RTI;                 { irq0 }
        RTI; RTI; RTI; RTI;                 { irq1 }
        RTI; RTI; RTI; RTI;                 { timer }

{ Initialize }

start:  CALL CntlReg_inits;                 { set SPORT0, TIMER, etc }
        I0=^delay_line; M0=0; L0=length;
        I5=^a_coeffs;   M4=1; L5=length;
        I6=^b_coeffs;   M5=1; L6=length;
        M1=1;
        M2=2;
        AX0=length-1;
        CNTR=length;
        DO zero UNTIL CE;                   { clear out the filter delay line buffer }
zero:       DM(I0,M1)=0;
        CNTR=length-2;
        ICNTL=B#00111;                      { disable IRQ nesting, all IRQs edge-sensitive }
        IMASK=B#101000;                     { enable IRQ2 and SPORT0_RX interrupt }
wait:   IDLE;
        JUMP wait;

{ Process Input Sample }

sample: MR1=RX0;                            { get new sample from SPORT0 (from microphone) }
        CALL iir_dfii;
        TX0=MR1;                            { filtered output to SPORT0 (to spkr) }
        SR = ASHIFT MR1 BY 1 (HI);
        DM(write_dac0)=SR1;                 { latch sample for D/A }
        DM(load_dac)=SR1;                   { display sample on oscilloscope }
        RTI;

.ENDMOD;

Listing 7-2: Example Program using Direct Form II Structure
maximum gain equal to one at any frequency. Although the frequency domain unity gain
condition does not absolutely preclude overflow, it can make overflow acceptably improbable
while using the available dynamic range more efficiently. Furthermore, the output of an FIR
filter is never fed back, so any error due to overflow does not affect subsequent outputs. An
IIR filter is also designed to have maximum unity gain at any frequency. However, it has
feedforward and feedback parts and each part is implemented separately as a sum-of-products
calculation. Now there is no guarantee that each part will have a maximum gain equal to one.
Therefore direct use of the filter coefficients from the design (such as those from Table 7-3)
can result in severe overflow problems. Moreover, because some of the values are fed back
into the filter calculations, the error can corrupt the subsequent output values.

Referring to equations (7-22) and (7-23), it is obvious that we have to protect the
intermediate signal w(n) from overflowing at each n. The ADSP-2101 processor, in its default
mode, implements 1.15 fixed-point format arithmetic. Therefore the maximum value of data
is 1. Equation (7-22) is an FIR filter and hence its gain is very easy to calculate. Consider

$$|y(n)| = \left|\sum_{k=0}^{N} b_k\, w(n-k)\right| \le \sum_{k=0}^{N} |b_k|\,|w(n-k)| \le \sum_{k=0}^{N} |b_k|$$

since $|w(n-k)| \le 1$. Since the output values also satisfy $|y(n)| \le 1$, we must have

$$\sum_{k=0}^{N} |b_k| \le 1$$

From Table 7-3, all $b_k$ values are positive and their sum is equal to 131,068. Hence these
coefficients must be scaled by $7.62962\times10^{-6}$ to avoid overflow in the feedforward path. These
scaled coefficients in 1.15 hexadecimal format are available in disk file IIRDFLPB.DAT. An
alternate approach is to scale the $b_k$ values so that

$$\max_{\omega}\,|B(e^{j\omega})| \le 1$$

where $B(e^{j\omega})$ is the discrete-time Fourier transform of $b_k$.

The gain for the feedback part is not so easy to calculate. We can compute the impulse
response of the feedback section and then use the above approach. However, this is tedious
and may in fact lead to overscaling. The easier method is to use the alternate approach in the
frequency domain. From equation (7-23), the magnitude response of the feedback section is

$$\left|\frac{W(e^{j\omega})}{X(e^{j\omega})}\right| = \frac{1}{\left|1 + \sum_{k=1} a_k e^{-j\omega k}\right|}$$

This response can be computed using a computer program. The $a_k$ coefficients can now be
scaled so that the maximum value of the above response is 1. For the values given in Table
7-3, this maximum gain is $9.01303\times10^{2}$. Hence these coefficients must be scaled by
$1.1095\times10^{-3}$. Another approach would be to look at the overall gain factor
($7.41033\times10^{-9}$) given in Table 7-3. Since we know the gain for the FIR part
($7.62962\times10^{-6}$), the gain for the feedback part is $0.97125\times10^{-3}$, which is in
rough agreement with the earlier value. The scaled $a_k$ coefficients in 1.15 hexadecimal
format are available in the disk file IIRDFLPA.DAT.
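The two scaling rules above can be tried out on a PC before the coefficients are committed to a data file. The following is a minimal Python sketch (not ADSP-2101 code); the function names `l1_scale`, `peak_feedback_gain`, and `to_q15_hex` are hypothetical helpers introduced here for illustration, and the frequency grid density is an assumption.

```python
import cmath
import math

def l1_scale(b):
    """Scale factor 1 / sum|b_k| that keeps the feedforward (FIR) part at or below 1."""
    return 1.0 / sum(abs(bk) for bk in b)

def peak_feedback_gain(a, num_points=4096):
    """Peak of 1 / |1 + sum_{k>=1} a_k e^{-jwk}| on a uniform grid over [0, pi].

    `a` holds a_1, a_2, ... (the leading 1 of the denominator is implied).
    The grid density is an assumption; a sharp resonance may need a finer grid.
    """
    peak = 0.0
    for i in range(num_points + 1):
        w = math.pi * i / num_points
        denom = 1 + sum(ak * cmath.exp(-1j * w * (k + 1)) for k, ak in enumerate(a))
        peak = max(peak, 1.0 / abs(denom))
    return peak

def to_q15_hex(x):
    """Round x to 1.15 format and return a 4-digit hex string, saturating at the limits."""
    q = int(round(x * 32768))
    q = max(-32768, min(32767, q))
    return format(q & 0xFFFF, "04X")
```

For example, `l1_scale([1, 2, 1])` gives 0.25, and a coefficient sum of 131,068 gives a scale factor close to the 7.62962x10^-6 quoted above. The reciprocal of `peak_feedback_gain` plays the role of the feedback scale factor.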

7.4.3 Limit Cycles in Direct Form Structures 

The overflow problems described above, if not properly treated, can cause severe problems
known as large-scale limit cycles. If a large value results in a sum-of-products calculation of
the intermediate signal w(n), then in two's-complement arithmetic, what should have been a
large positive number may instead become a large negative number. This large number is fed
back, causing another large number with sign reversal. Once these large oscillations start they
are difficult to stop without turning off the hardware. However, the proper scaling discussed in
the above section can virtually eliminate these large-scale limit cycles. Moreover, an instruction
to saturate the result after overflow detection can prevent sign reversal.

There is another type of limit cycle which is completely different from the one above. It
results from the quantization effects in calculations and is called a small-scale limit cycle. When
the input signal is zero, the output of every stable filter decays to zero over time. When this output
is smaller than the quantization step, it may get rounded up to the previous level, causing steady
oscillations with small amplitudes. When there is a significant amount of input, these limit
cycles tend to go away or at least are not noticeable. This is not a severe problem in general, but
in audio applications it can result in an irritating tone when there is no input.
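Both effects can be reproduced in a few lines on a PC. The sketch below is an illustrative Python model, not ADSP-2101 code: `wrap16` and `sat16` are hypothetical helpers that imitate 16-bit two's-complement wraparound and saturation, and `first_order_rounded` assumes a first-order section y(n) = round(a y(n-1)) with a 1.15 coefficient, chosen only because it is the simplest filter that shows a small-scale limit cycle.

```python
def wrap16(x):
    """Two's-complement wraparound of an integer to 16 bits."""
    return ((x + 32768) & 0xFFFF) - 32768

def sat16(x):
    """Saturate an integer to the 16-bit range instead of wrapping."""
    return max(-32768, min(32767, x))

def first_order_rounded(a_q15, y0, steps):
    """Zero-input response of y(n) = round(a * y(n-1)) in integer arithmetic.

    a_q15 is the coefficient in 1.15 format (an integer), y0 the initial state.
    """
    out = [y0]
    for _ in range(steps):
        # Multiply, add half an LSB, and shift back: round to nearest.
        out.append((a_q15 * out[-1] + (1 << 14)) >> 15)
    return out

print(wrap16(30000 + 10000))                 # large positive sum becomes negative
print(sat16(30000 + 10000))                  # saturation keeps the sign
print(first_order_rounded(-29491, 10, 10))   # a is about -0.9
```

With a of about -0.9 and an initial state of 10, the output does not decay to zero: it settles into a steady oscillation between +4 and -4, which is the small-amplitude tone described above.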

7.4.4 Exercises 

1. Write a subroutine to implement the Direct Form I structure. This will require two
circular buffers, one each for input and output. Assume that the numerator and
denominator orders of H(z) are different. Use properly scaled filter coefficients from
Example 7-1 and verify its operation.

2. Implement the lowpass Chebyshev-I filter designed in Example 7-2 in the Direct 
Form I structure. Properly scale coefficients using ideas developed in Section 7.4.2. 
Verify its operation and compare this filter with the Butterworth filter. 

3. Limit cycles can easily be simulated by causing overflow to occur. Modify the
IIRDFMII.DSP subroutine by eliminating the saturation logic. Now try to cause an
overflow either by underscaling the $a_k$ coefficients or by increasing the level of the input
signal. You should be able to obtain an annoyingly large tone. This is the large-scale
limit cycle. To simulate a small-scale limit cycle, turn off the input and observe the
output signal. Since the output doesn't always enter into this limit cycle, you will
have to try turning off the input several times to observe one.

7.5 CASCADE FORM IIR

The direct form structure suffers from many practical problems even though it executes faster.
The coefficients and data must be scaled all at once, which gives rise to larger numerical errors.
The poles of the single-stage direct form get increasingly sensitive to quantization errors. These
problems can be alleviated if a cascade of second-order biquad IIR filter sections is used in
the implementation of IIR filters. The biquad implementation executes slower but generates
smaller numerical errors. The biquads can be scaled separately and then cascaded in order to
minimize the coefficient quantization and the recursive numerical errors.

7.5.1 A Cascade Form Subroutine

A second-order biquad IIR filter section is shown in Figure 7-3. Its system function in the
z-domain is given by

$$H(z) = \frac{Y(z)}{X(z)} = \frac{B_0 + B_1 z^{-1} + B_2 z^{-2}}{1 - A_1 z^{-1} - A_2 z^{-2}}$$

The corresponding difference equation for a biquad section is

$$y(n) = B_0 x(n) + B_1 x(n-1) + B_2 x(n-2) + A_1 y(n-1) + A_2 y(n-2)$$

which can also be obtained from Figure 7-3.

An ADSP-2101 subroutine that implements a cascade structure is shown in Listing 7-3.
The subroutine is arranged as a module and is labeled biquad_sub. There are a number of
registers that need to be initialized in order to execute this subroutine. It may be sufficient to
do this initialization only once (e.g., at powerup) if other executed algorithms do not need these
registers. In most typical cases, however, some of these registers may need to be set every time
the biquad_sub routine is called. It may sometimes be beneficial, from a modular software
point of view, to always initialize all the setup registers as a part of this subroutine.

The biquad_sub routine takes its input from the SR1 register. This register must contain
the 16-bit input x(n). x(n) is assumed to be already computed before this subroutine is called.
The output of the filter is also made available in the SR1 register.

After the initial design of a high order IIR filter, all coefficients must be scaled down in
each biquad stage separately. This is necessary in order to conform to the 16-bit fixed-point
fractional number format as well as to ensure that overflows will not occur in the final
multiply-accumulate operations in each stage. The scaled-down coefficients are the ones that
get stored in the processor's memory. The operations in each biquad are performed with scaled
data and coefficients, and the results are eventually scaled up before being output to the next
one. The choice of a proper scaling factor depends greatly on the design objectives, and in some
cases it may even be unnecessary.

During the initialization of the biquad_sub routine, the index register I0 points to the data
memory buffer that contains the previous inputs and the previous biquad section outputs.
This buffer must be initialized to zero at powerup unless some nonzero initial condition is
desired. The index register I1 points to another buffer in data memory that contains the individual
scale factors for each biquad. The buffer length register L1 is set to zero if the filter has only
one biquad section. In the case of multiple biquads, L1 is initialized with the number of biquad
sections. The index register I4, on the other hand, points to the circular program memory buffer
that contains the scaled biquad coefficients. These coefficients are stored in the order
$B_2$, $B_1$, $B_0$, $A_2$, and $A_1$ for each biquad. All of the individual biquad coefficient groups must be
stored in the same order that the biquads are cascaded in. The buffer length register L4 must
be set to the value of (5 x number of biquad sections). Finally, the loop counter register CNTR
must be set to the number of biquad sections since the filter code will be executed as a loop.

The core of the biquad_sub routine starts its execution at the biquad label. The routine
is organized in a looped fashion where the end of the loop is the instruction labeled sections.
Each iteration of the loop executes the computations for one biquad. The number of loops to
be executed is determined by the CNTR register contents. The SE register is loaded with the
appropriate scaling factor for the particular biquad at the beginning of each loop iteration. After
this operation, the coefficients and the data values are fetched from memory in the sequence
that they have been stored. These numbers are multiplied and accumulated until all of the values
for a particular biquad have been accessed. The result of the last multiply/accumulate is rounded
to 16 bits and upshifted by the scaling value. At this point the biquad loop is executed again,
or the filter computations are completed by doing the final update to the delay line. The delay
lines for data values are always being updated within the biquad loop as well as outside of it.

The filter coefficients must be scaled appropriately so that overflows do not occur after the
upshifting operation between the biquads. If this is not ensured by design, it may be necessary
to include some overflow checking between the biquads.
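Before moving to the fixed-point listing, it can help to have a floating-point reference against which the ADSP-2101 output can be compared. The following Python sketch is such a reference under stated assumptions: it applies the biquad difference equation of this section to each stage in turn, keeps separate state per section (the assembly routine instead shares delay-line storage between adjacent sections), and performs no scaling or rounding. The name `biquad_cascade` and the tuple layout are hypothetical.

```python
def biquad_cascade(x, sections):
    """Filter the sequence x through cascaded biquads (floating-point reference).

    Each section is a tuple (B0, B1, B2, A1, A2) implementing
        y(n) = B0 x(n) + B1 x(n-1) + B2 x(n-2) + A1 y(n-1) + A2 y(n-2)
    """
    # Per-section state: [x(n-1), x(n-2), y(n-1), y(n-2)]
    state = [[0.0, 0.0, 0.0, 0.0] for _ in sections]
    out = []
    for sample in x:
        v = sample
        for (b0, b1, b2, a1, a2), s in zip(sections, state):
            y = b0 * v + b1 * s[0] + b2 * s[1] + a1 * s[2] + a2 * s[3]
            s[1], s[0] = s[0], v      # shift the input delay line
            s[3], s[2] = s[2], y      # shift the output delay line
            v = y                     # output of this stage feeds the next
        out.append(v)
    return out
```

Feeding a unit impulse through `biquad_cascade` gives the impulse response of the cascade, which can be compared sample by sample with the scaled fixed-point output.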



{ Cascaded Biquad IIR Filter Subroutine                          IIRCASFM.DSP

  Calling Parameters
        SR1 = input sample
        I0 --> delay line buffer for x(n-2), x(n-1), y(n-2), y(n-1)
        L0 = 0
        I1 --> list of scale factors for each biquad section
        L1 = 0 (in the case of a single biquad)
        L1 = number of biquad sections
        I4 --> scaled coefficients b2, b1, b0, a2, a1, b2, b1, b0, a2, a1, ...
        L4 = 2.5 * filter order  -- or --  5 * number of biquad sections
        M0,M4 = 1
        M1 = -3
        M2 = 1 (in the case of multiple biquads)
        M2 = 0 (in the case of a single biquad)
        M3 = (1 - length of delay line buffer)
        CNTR = number of biquad sections

  Return Values
        SR1 = output sample
        I0 --> inside delay line buffer
        I1 --> top of scale factor list
        I4 --> top of coefficients

  Altered Registers
        MX0,MX1,MY0,MR,SE,SR

  Computation Time
        8 * number of biquad sections + 5 + 5 cycles

  All coefficients and data values are assumed to be in 1.15 format.
}

.MODULE biquad_sub;
.ENTRY  biquad;

biquad:   I0=^delayline;
          DO sections UNTIL CE;
             SE=DM(I1,M2);
             MX0=DM(I0,M0), MY0=PM(I4,M4);                      {get x(n-2), B2}
             MR=MX0*MY0 (SS), MX1=DM(I0,M0), MY0=PM(I4,M4);     {get x(n-1), B1}
             MR=MR+MX1*MY0 (SS), MY0=PM(I4,M4);                 {get B0}
             MR=MR+SR1*MY0 (SS), MX0=DM(I0,M0), MY0=PM(I4,M4);  {get y(n-2), A2}
             MR=MR+MX0*MY0 (SS), MX0=DM(I0,M1), MY0=PM(I4,M4);  {get y(n-1), A1}
             DM(I0,M0)=MX1, MR=MR+MX0*MY0 (RND);                {store x(n-1) as new x(n-2)}
sections:    DM(I0,M0)=SR1, SR=ASHIFT MR1 (HI);                 {store x(n) as new x(n-1)}
          DM(I0,M0)=MX0;
          DM(I0,M3)=SR1;
          RTS;
.ENDMOD;

Listing 7-3: IIR Cascade Form Subroutine 

7.5.2 An Example Program 

As an example of the above subroutine, we once again consider the lowpass filter designed in
Example 7-1. The appropriate coefficients for this program are the cascade form coefficients
shown in Table 7-4. Note that the $B_0$ coefficients for all biquad sections are equal to 1 in our
formulation according to equation (7-3).

The scaling issue in the cascade form is even trickier than in the direct form. As
discussed earlier, each biquad section must be scaled appropriately to avoid overflow of
intermediate signals (the total number of which is equal to the number of biquads). Each biquad
is in fact a direct form IIR filter. Therefore the same scaling analysis described in Section
7.4.2 must be performed for each section. The FIR part is relatively easy: since each $B_k$ is
positive and their sum is equal to 4, each $B_k$ must be scaled by 0.25 to avoid overflow in the
FIR part. The feedback part of each biquad section must be analyzed individually and its
maximum frequency domain gain must be computed. The inverse of this gain now forms the
scaling factor for that section. These scaling factors are stored in disk file IIRCFLPS.DAT
while the scaled coefficients are available in the IIRCFLPC.DAT file.



{ Lowpass IIR Filter                                             IIRCFMLP.DSP
  Using Cascade Form Structure

  This program implements a single precision Cascade Form IIR filter
  structure using subroutine biquad_sub available in disk file
  IIRCASFM.DSP. The filter coefficients used in this program are for the
  lowpass filter described in Example 7-1. These coefficients are stored
  in disk file IIRCFLPC.DAT and the scaling factors in IIRCFLPS.DAT.
  This program is written for EZ-ICE and EZ-LAB system with EZLAB1.ACH
  architecture file. Additionally, this program uses subroutine
  CntlReg_inits to initialize all control registers. It is available in
  disk file CREGINIT.DSP. Assemble IIRCFMLP.DSP, IIRCASFM.DSP, and
  CREGINIT.DSP using ASM21.EXE. Link using LD21.EXE to produce
  IIRCFMLP.EXE. Load IIRCFMLP.EXE in EZ-ICE and execute.
}

.MODULE/RAM/ABS=0/BOOT=0   IIR_DF2LP;
.PORT   write_dac0;
.PORT   load_dac;

.CONST  N = 9;               { number of biquad sections, example: 9 }
.CONST  N_x_5 = 45;          { number of biquad sections times five }

.VAR/DM delayline[4];        { this is scratchpad memory }
.VAR/DM scalelist[N];        { initialize with the scale factor for each biquad }
.VAR/PM coefflist[N_x_5];    { init with filter coefficients for each biquad }

.INIT   scalelist: <IIRCFLPS.DAT>;
.INIT   coefflist: <IIRCFLPC.DAT>;

        JUMP start;  RTI; RTI; RTI;       {Reset Vector}
        RTI; RTI; RTI; RTI;               {irq2}
        RTI; RTI; RTI; RTI;               {sport0 TX}
        JUMP sample; RTI; RTI; RTI;       {sport0 RX}
        RTI; RTI; RTI; RTI;               {irq0}
        RTI; RTI; RTI; RTI;               {irq1}
        RTI; RTI; RTI; RTI;               {timer}

{ Initialize }

start:  CALL CntlReg_inits;               { set up SPORTS, TIMER, etc }
        I0=^delayline;  M0=1;   L0=0;
        I1=^scalelist;  M1=-3;  L1=N;
        I4=^coefflist;  M4=1;   L4=N_x_5;
        I6=^b_coeffs;   M5=1;   L6=length;
        M2=1;
        M3=-3;
        CNTR=4;
        DO zero UNTIL CE;
zero:      DM(I0,M0)=0;                   { clear out the filter delay line buffer }
        CNTR=N;
        ICNTL=B#00111;                    { disable IRQ nesting, all IRQs edge-sensitive }
        IMASK=B#101000;                   { enable IRQ2 and SPORT0 RX interrupt }
wait:   IDLE;
        JUMP wait;

{ Process Input Sample }

sample: SR1=RX0;                          { get new sample from SPORT0 (from microphone) }
        CALL biquad;
        TX0=SR1;                          { filtered output to SPORT0 (to spkr) }
        SR=ASHIFT SR1 BY 1 (HI);
        DM(write_dac0)=SR1;               { latch sample for D/A }
        DM(load_dac)=SR1;                 { display sample on oscilloscope }
        RTI;

.ENDMOD;



Listing 7-4: Example Program using Cascade Form Structure 



7.5.3 Limit Cycles in Cascade Form Structures 

In the case of the Cascade Form structure with more than one biquad, the limit cycles are much
more difficult to analyze. When any biquad section, except the last one, exhibits a small-scale
limit cycle, the output limit cycle is filtered by the succeeding sections. If the frequency of this
limit cycle falls near the resonance frequency in a succeeding section, the amplitude of the limit
cycle becomes very large. Likewise, if the succeeding section has a null at or near the limit

7.5.4 Exercises

1. Implement the lowpass Chebyshev-I filter designed in Example 7-2 in a Cascade
Form structure. Use Table 7-7 and properly scale numerator coefficients. Determine
the scaling factor for each section using the ideas developed in Section 7.5.2. Verify its
operation and compare this filter with the Butterworth filter of Example 7-1 in Section
7.4.5.

2. Design a digital filter with the following specifications:

      Sampling frequency    : 8 kHz
      Passband              : 0 - 2 kHz
      Passband ripple       : 1 dB
      Prototype             : Butterworth
      Filter order          : 10
      Filter transformation : Bilinear

Determine the coefficients for the Cascade Form realization. Properly scale these
coefficients and implement the filter using the IIRCASFM subroutine. Verify the
operation of the filter.

Section 6.5 described the all-zero lattice filter implementation. From the discussion in Section
7.2.3, it is clear that the all-pole lattice is very similar to the all-zero lattice except for the roles
of input and output. Therefore the implementation of the all-pole lattice filter on the ADSP-2101
is also similar to that of the all-zero lattice filter. This implementation is shown in Listing 7-5.

Before the all_pole_lattice_filter routine is called, various registers must be preloaded.
The index register I0 should point to the start of the input buffer, I1 to the start of the coefficient
buffer, I2 to the start of the output buffer, and I4 to the start of the filter delay line. The length
registers L0 and L2 should both be set to zero, and L1 and L4 should be set to the filter order
to make the coefficient and the delay line buffers circular. The modify registers M0 and M4
should both be set to one; M1 and M5 should both be set to -1. M6 should be set to three
and M7 to -2. The SE register, which controls data scaling, should be set to an appropriate
value, and AX0 should be set to the order of the filter less one.

The routine loads the first input data value into MY0. The outloop loop is executed
once for each output data value. The MR register is loaded with the scaled value of $f_m(n)$ at the
same time the coefficient $k_m$ and delay line value $g_{m-1}(n-1)$ are loaded. The next instruction
computes the value $f_{m-1}(n)$ and also loads the next multiplier operands. The dataloop loop
performs the remainder of the filtering operation on the data point.

In the dataloop loop, $f_{m-1}(n)$ is computed and then shifted to the proper format for the
next multiplication. Then the value of $g_m(n)$ is computed and stored in the delay line. After the
dataloop loop has been executed, the pointers to the delay line and the coefficient buffer are
moved to the tops of their buffers at the same time the output of the filter and the last delayed
point $f_1(n)$ are saved.
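The order of operations in the routine is easier to follow in a floating-point model. The Python sketch below is an illustration only and rests on an assumption: it uses the standard all-pole lattice recursion $f_{m-1}(n) = f_m(n) - k_m g_{m-1}(n-1)$ and $g_m(n) = k_m f_{m-1}(n) + g_{m-1}(n-1)$, whose sign convention may differ from the one used in Section 7.2.3. The name `all_pole_lattice` is hypothetical, and no 1.15 scaling or shifting is modeled.

```python
def all_pole_lattice(x, k):
    """All-pole lattice filter (floating-point sketch).

    k[m-1] holds the reflection coefficient k_m for m = 1..M.
    Assumed recursion (sign convention is an assumption):
        f_{m-1}(n) = f_m(n) - k_m * g_{m-1}(n-1)
        g_m(n)     = k_m * f_{m-1}(n) + g_{m-1}(n-1)
    with f_M(n) = x(n) and y(n) = f_0(n) = g_0(n).
    """
    M = len(k)
    g = [0.0] * (M + 1)              # g[m] holds g_m(n-1), the delay line
    out = []
    for sample in x:
        f = sample                   # f_M(n)
        for m in range(M, 0, -1):
            f = f - k[m - 1] * g[m - 1]       # f_{m-1}(n)
            g[m] = k[m - 1] * f + g[m - 1]    # g_m(n), stored for the next sample
        g[0] = f                     # g_0(n) = y(n)
        out.append(f)
    return out
```

For a single coefficient $k_1 = 0.5$ this reduces to y(n) = x(n) - 0.5 y(n-1), so a unit impulse produces 1, -0.5, 0.25, and so on.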







{ All-Pole Lattice Filter Subroutine                             IIRLATFL.DSP

  Calling Parameters
        CNTR = Length of Excitation Signal
        I0 --> Excitation Signal                  L0 = 0
        I1 --> Coefficient Buffer (circular)      L1 = Filter Order
        I2 --> Output Buffer                      L2 = 0
        I4 --> Delay Line Buffer (circular)       L4 = Filter Order
        AR = Formatted 1
        M0,M4 = 1                                 M1,M5 = -1
        M6 = 3                                    M7 = -2
        SE = Appropriate scale value
        AX0 = Filter Order - 1

  Return Values
        Output Buffer Filled

  Altered Registers
        MX0,MY0,MY1,MR,SR,I0,I1,I2,I4

  Computation Time
        (6*(Filter Length-1)+8) * Output Buffer Length + 3 + 6 cycles
}

.MODULE all_pole_lattice_filter;
.ENTRY  p_latt;

p_latt:   MY0=DM(I0,M0);                                        {Get input data}
          DO outloop UNTIL CE;                                  {Loop through output}
             CNTR=AX0;
             MR=AR*MY0 (SS), MX0=DM(I1,M0), MY0=PM(I4,M4);      {Get g,k}
             MR=MR-MX0*MY0 (SS), MX0=DM(I1,M0), MY0=PM(I4,M4);  {MR=f1}
             DO dataloop UNTIL CE;                              {Loop through filter}
                MR=MR-MX0*MY0 (SS);                             {Compute fm}
                SR=ASHIFT MR1 (HI);                             {Reformat fm}
                MY1=SR1, MR=AR*MY0 (SS);                        {Format gm+1}
                MR=MR+MX0*MY1 (SS), MX0=DM(I1,M0), MY0=PM(I4,M7);  {MR=gm}
                SR=ASHIFT MR1 (HI);                             {Reformat gm}
dataloop:       PM(I4,M6)=SR1, MR=AR*MY1 (SS);                  {Save gm, format fm}
             MY0=PM(I4,M7), MX0=DM(I1,M1);                      {Reset Pointers}
             MY0=DM(I0,M0), SR=ASHIFT MR1 (HI);                 {Get new data point}
             DM(I2,M0)=MY1, SR=SR OR LSHIFT MR0 (LO);           {Store output}
outloop:     PM(I4,M4)=SR1;                                     {Save Y}
          RTS;
.ENDMOD;

Listing 7-5: All-Pole Lattice Filter Subroutine



7.7 SUMMARY 

We have briefly discussed the design of frequency selective IIR digital filters using analog
prototypes. Among all analog prototypes, elliptic filters are the most efficient to implement but
we need commercially available filter design programs for carrying out the design. In practice,
these filters should be considered when phase distortion is unimportant. Butterworth filters, on
the other hand, give mild nonlinear phase responses.

The implementation of IIR filters on the ADSP-2101 was the most important topic of this
chapter. We described the Direct Form and Cascade Form structures which are suitable for
implementation on the ADSP-2101. We then discussed assembly language subroutines for
these structures along with complete example programs. In each case we provided a thorough
and useful discussion on coefficient scaling to avoid overflow and large-scale limit cycle
problems. Finally, we presented the all-pole lattice filter structure and its implementation on
the ADSP-2101 microcomputer.

FAST FOURIER TRANSFORM IMPLEMENTATIONS



8.1 INTRODUCTION 

Signal representation and analysis in the frequency domain is an important area in digital signal 
processing. It is performed using the discrete Fourier transform (DFT) on the data sequence. 
The DFT is widely used in applications to determine spectral contents, to perform correlation 
analysis and to implement linear filtering in the frequency domain. This widespread use is due 
to the existence of efficient algorithms for computing the DFT. In this chapter, we discuss the 
DFT and describe two common methods of efficient computation, called fast Fourier transforms 
(FFT), and explain their detailed implementation on the ADSP-2101. 

The architecture and the instruction set of the ADSP-2101 processor are very well suited
for implementation of the FFT algorithms. These algorithms repeatedly perform a core
multiply/add operation, called a butterfly, on ordered pairs of data points and either require a
scrambled order of the input data or result in a scrambled order of the output data. This requires
very efficient address generation, which can be accomplished by the two data address generators
(DAG1 and DAG2) of the ADSP-2101. Furthermore, DAG1 contains bit-reversal hardware
which can be used to scramble or unscramble the data on the fly.

The chapter begins with a brief review of the DFT in Section 8.2. Two FFT algorithms
are then considered in the next two sections. In Section 8.3, the decimation-in-time (DIT) FFT
is first described and then a complete discussion on its implementation using program modules
is provided. A similar treatment is given in Section 8.4 for the decimation-in-frequency (DIF)
FFT algorithm. Finally in Section 8.5, the inverse DFT is introduced and its implementation
is discussed.

8.2 THE DISCRETE FOURIER TRANSFORM

The frequency analysis of a sampled signal x(n) is performed using the Discrete-Time Fourier
Transform (DTFT) $X(e^{j\omega})$ given by

$$X(e^{j\omega}) = \sum_{n} x(n)\, e^{-j\omega n}$$

However, the DTFT is a continuous function of $\omega$. Therefore its numerical evaluation requires
sampling in the frequency domain. If the sequence x(n) is of finite length N, then N samples of
its DTFT $X(e^{j\omega})$ over the 0 to $2\pi$ interval completely describe the sequence x(n). This sampled
version of $X(e^{j\omega})$ is evaluated using the discrete Fourier transform (DFT) X(k) given by

$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j2\pi nk/N}, \quad k = 0, \ldots, N-1$$

To simplify the notation, it is desirable to define the complex-valued phase factor $W_N$, which is
the Nth root of unity, as

$$W_N = e^{-j2\pi/N}$$

Then the DFT equation becomes

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \quad k = 0, \ldots, N-1 \qquad (8\text{-}1)$$

A complex summation of N complex multiplications (4N real multiplications) and N-1
complex additions (4N-2 real additions) is required for each of the N output samples.
Consequently, computing all N values of the DFT requires $N^2$ complex multiplications and $N^2 - N$
complex additions. This direct computation of the DFT is inefficient primarily because
it does not exploit the symmetry and periodicity properties of the phase factor $W_N$. The time
burden created by this large number of calculations limits the usefulness of the DFT in many
applications. For this reason, a tremendous amount of effort was devoted to developing more
efficient ways of computing the DFT. This effort produced computationally efficient algorithms
which are collectively known as fast Fourier transform (FFT) algorithms.
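The operation count above can be checked directly. The following Python sketch is a straightforward evaluation of equation (8-1) that also counts the complex multiplications it performs; `direct_dft` is a hypothetical name, and the code is meant only as a reference against which a faster algorithm can be compared, not as an efficient implementation.

```python
import cmath

def direct_dft(x):
    """Evaluate X(k) = sum_n x(n) W_N^{nk} directly; return (X, complex multiplies)."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)    # phase factor W_N
    X = []
    mults = 0
    for k in range(N):
        acc = 0j
        for n in range(N):
            acc += x[n] * W ** (n * k)   # one complex multiplication per term
            mults += 1
        X.append(acc)
    return X, mults
```

For a 4-point input the counter returns 16, which is $N^2$; for N = 1024 it would be over a million, which is the burden the FFT removes.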

There are two different approaches to efficiently computing the DFT. In the first approach,
a divide-and-conquer strategy is employed in which an N-point DFT, where N is a composite
number, is reduced to the computation of smaller DFTs from which the larger DFT is computed.
In this chapter, we will present this approach for computing the DFT when the size N is a power
of 2. In particular, we will describe two important algorithms, the radix-2 decimation-in-time
fast Fourier transform (DIT FFT) and the radix-2 decimation-in-frequency fast Fourier transform
(DIF FFT), and their implementation on the ADSP-2101.

The second approach is based on the formulation of the DFT as a linear filtering operation
on the data. This approach leads to two algorithms, the Goertzel algorithm and the chirp-z
transform algorithm. The Goertzel algorithm is described in Section 9.6.

8.3 RADIX-2 DECIMATION-IN-TIME FFT

The decimation-in-time (DIT) FFT divides the input (time) sequence into two groups, one of 
even samples and the other of odd samples. N/2-point DFTs are performed on these sub- 
sequences, and their outputs are combined to form the N-point DFT. 

8.3.1 The Algorithm

The decimation-in-time methodology is illustrated by the following equations (Proakis
and Manolakis [8]). First, x(n), the input sequence in equation (8-1), is divided into even and
odd sub-sequences:

$$X(k) = \sum_{n=0}^{N/2-1} x(2n)\, W_N^{2nk} + \sum_{n=0}^{N/2-1} x(2n+1)\, W_N^{(2n+1)k}$$

$$\phantom{X(k)} = \sum_{n=0}^{N/2-1} x(2n)\, W_N^{2nk} + W_N^{k} \sum_{n=0}^{N/2-1} x(2n+1)\, W_N^{2nk}, \quad k = 0, \ldots, N-1 \qquad (8\text{-}2)$$

By the substitutions

$$W_N^{2nk} = W_{N/2}^{nk}, \qquad x_1(n) = x(2n), \qquad x_2(n) = x(2n+1)$$

this equation becomes

$$X(k) = \sum_{n=0}^{N/2-1} x_1(n)\, W_{N/2}^{nk} + W_N^{k} \sum_{n=0}^{N/2-1} x_2(n)\, W_{N/2}^{nk} = Y(k) + W_N^{k} Z(k), \quad k = 0, \ldots, N-1 \qquad (8\text{-}3)$$

Equation (8-3) is the sum of two N/2-point DFTs (Y(k) and Z(k)) performed on the
sub-sequences of even and odd samples, respectively, of the input sequence, x(n). Multiples of
$W_N$ (called "twiddle factors") appear as coefficients in the FFT calculation. In equation (8-3),
Z(k) is multiplied by the twiddle factor $W_N^{k}$.

Because $W_N^{k+N/2} = (e^{-j2\pi/N})^{k} \cdot (e^{-j2\pi/N})^{N/2} = -W_N^{k}$, equation (8-3) can also be expressed as two
equations:

$$X(k) = Y(k) + W_N^{k} Z(k)$$

$$X(k+N/2) = Y(k) - W_N^{k} Z(k), \quad k = 0, \ldots, \frac{N}{2}-1 \qquad (8\text{-}4)$$

Together these equations form an N-point FFT. Figure 8-1 illustrates this first decimation of
the DFT.

Figure 8-1: First Decimation of DIT FFT

The two N/2-point DFTs (Y(k) and Z(k)) can be divided to form four N/4-point DFTs,
yielding equation pairs (8-5) and (8-6).



$$Y(k) = U(k) + W_N^{2k} V(k)$$

$$Y(k+N/4) = U(k) - W_N^{2k} V(k) \qquad (8\text{-}5)$$

$$Z(k) = R(k) + W_N^{2k} S(k)$$

$$Z(k+N/4) = R(k) - W_N^{2k} S(k), \quad k = 0, \ldots, \frac{N}{4}-1 \qquad (8\text{-}6)$$

U(k) and V(k) are N/4-point DFTs whose input sequences are created by dividing $x_1(n)$ into
even and odd sub-sequences. Similarly, R(k) and S(k) are N/4-point DFTs performed on the
even and odd sub-sequences of $x_2(n)$. Each of these four equations can be divided to form two
more. The final decimation occurs when each pair of equations together computes a two-point
DFT (one point per equation). The pair of equations that make up the two-point DFT is called
a radix-2 "butterfly." The butterfly is the core calculation of the FFT. The entire FFT is performed
by combining butterflies in patterns determined by the FFT algorithm.
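The repeated even/odd decimation maps naturally onto a recursive function. The Python sketch below is a minimal illustration under stated assumptions: it applies the pair of equations (8-4) at every level, recurses down to one-point DFTs rather than stopping at the two-point butterfly, and uses floating-point complex arithmetic. It is not the in-place, fixed-point organization used on the ADSP-2101, and `fft_dit` is a hypothetical name.

```python
import cmath

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return list(x)                   # a one-point DFT is the sample itself
    Y = fft_dit(x[0::2])                 # N/2-point DFT of the even samples
    Z = fft_dit(x[1::2])                 # N/2-point DFT of the odd samples
    X = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * Z[k]   # twiddle factor W_N^k times Z(k)
        X[k] = Y[k] + t                  # X(k)       = Y(k) + W_N^k Z(k)
        X[k + N // 2] = Y[k] - t         # X(k + N/2) = Y(k) - W_N^k Z(k)
    return X
```

Each level performs N/2 twiddle multiplications and there are $\log_2 N$ levels, which is where the saving over the direct computation comes from.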

A complete eight-point DIT FFT is illustrated graphically in Figure 8-2. Each pair of
arrows represents a butterfly. Notice that the entire FFT computation is made up of butterflies
organized in different patterns, called groups and stages. The first stage consists of four groups
of one butterfly each. The second stage has two groups of two butterflies, and the last has one
group of four butterflies. Every stage contains N/2 (four) butterflies. Each butterfly has two
input points, called the dual node and the primary node. The spacing between the nodes in the
sequence is called the dual-node spacing. Associated with each butterfly is a twiddle factor
whose exponent depends on the group and stage of the butterfly.






Figure 8-2: Eight-Point DIT FFT 



Notice that whereas the output sequence is sequentially ordered, the input sequence is not. 
This is an effect of repeatedly dividing the input sequence into sub-sequences of even and odd 
samples. It is possible to perform an FFT using input and output sequences in other orders, but 
these approaches generally complicate addressing in the FFT program and can require a different 
butterfly. In this section, we have opted to scramble the input sequence of the DIT FFT because 
this approach uses twiddle factors in sequential order, produces the output sequence in sequential 
order, and requires a relatively simple butterfly. The scrambling of the inputs is achieved by a 
process called bit reversal, which is described later in this chapter. 
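The scrambled input order can be generated by reversing the bits of each index. The Python sketch below illustrates the idea in software; the function names are hypothetical, and on the ADSP-2101 this reordering would normally be done by the address generator hardware rather than by a loop like this one.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (for example 001 becomes 100)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (index & 1)   # move the lowest bit of index into r
        index >>= 1
    return r

def scramble(x):
    """Return x reordered into bit-reversed order; len(x) must be a power of 2."""
    N = len(x)
    bits = N.bit_length() - 1        # log2(N)
    return [x[bit_reverse(i, bits)] for i in range(N)]
```

For an eight-point sequence the resulting order of sample indices is 0, 4, 2, 6, 1, 5, 3, 7, which is the result of repeatedly splitting the sequence into even and odd samples.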

A generalized butterfly flow graph is shown in Figure 8-3. The variables x and y represent
the real and imaginary parts, respectively, of a sample. The twiddle factor can be divided into
real and imaginary parts because $W_N = e^{-j2\pi/N} = \cos(2\pi/N) - j\sin(2\pi/N)$. In the program
presented later in this section, the twiddle factors are initialized in memory as cosine and -sine values
(not +sine). For this reason, the twiddle factors are shown in Figure 8-3 as C + j(-S). C represents
cosine and -S represents -sine.

The dual node ($x_1 + jy_1$) is multiplied by the twiddle factor C + j(-S). The result of this
multiplication is added to the primary node ($x_0 + jy_0$) to produce $x_0' + jy_0'$ and subtracted from the
primary node to produce $x_1' + jy_1'$. Equations (8-7) through (8-10) calculate the real and imaginary
parts of the butterfly outputs.

Figure 8-3: Radix-2 DIT FFT Butterfly

$$x_0' = x_0 + [(C)x_1 - (-S)y_1] \qquad (8\text{-}7)$$

$$x_1' = x_0 - [(C)x_1 - (-S)y_1] \qquad (8\text{-}8)$$

$$y_0' = y_0 + [(C)y_1 + (-S)x_1] \qquad (8\text{-}9)$$

$$y_1' = y_0 - [(C)y_1 + (-S)x_1] \qquad (8\text{-}10)$$



The butterfly produces two complex outputs that become butterfly inputs in the next stage
of the FFT. Because each stage has the same number of butterflies (N/2), the number of butterfly
inputs and outputs remains the same from one stage to the next. An "in-place" implementation
writes each butterfly output over the corresponding butterfly input ($x_0'$ overwrites $x_0$, etc.) for
each butterfly in a stage. In an in-place implementation, the FFT results end up in the same
memory range as the original inputs.
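The stage, group, and butterfly organization together with the in-place update can be modeled in a few lines. The following Python sketch is a floating-point illustration under stated assumptions: it keeps the real and imaginary parts in separate lists, uses the butterfly of equations (8-7) through (8-10) with C and -S computed on the fly instead of being read from a table, and omits the block floating-point scaling that a fixed-point version needs. The variable names `groups`, `bflys_per_group`, and `node_space` are chosen to echo the quantities discussed in this section; `bit_reversed_copy` and `fft_dit_inplace` are hypothetical helpers.

```python
import math

def bit_reversed_copy(x):
    """Return a copy of x in bit-reversed order; len(x) must be a power of 2."""
    N = len(x)
    bits = N.bit_length() - 1
    out = [0.0] * N
    for i in range(N):
        r = int(format(i, "0{}b".format(bits))[::-1], 2) if bits else 0
        out[r] = x[i]
    return out

def fft_dit_inplace(xr, xi):
    """In-place radix-2 DIT FFT on bit-reversed input (real part xr, imaginary part xi)."""
    N = len(xr)
    groups, bflys_per_group, node_space = N // 2, 1, 1
    while groups >= 1:                                   # one pass per stage
        for g in range(groups):
            base = 2 * g * node_space                    # first primary node of the group
            for b in range(bflys_per_group):
                angle = 2 * math.pi * b * groups / N     # twiddle exponent is b * groups
                C, mS = math.cos(angle), -math.sin(angle)    # C and -S
                p, d = base + b, base + b + node_space       # primary and dual node
                tr = C * xr[d] - mS * xi[d]              # real part of dual * twiddle
                ti = C * xi[d] + mS * xr[d]              # imaginary part of dual * twiddle
                xr[d], xi[d] = xr[p] - tr, xi[p] - ti    # x1', y1' overwrite the dual node
                xr[p], xi[p] = xr[p] + tr, xi[p] + ti    # x0', y0' overwrite the primary node
        groups //= 2
        bflys_per_group *= 2
        node_space *= 2
```

A typical use is `xr = bit_reversed_copy(samples)`, `xi = [0.0] * len(samples)`, then `fft_dit_inplace(xr, xi)`, after which `xr` and `xi` hold the DFT in sequential order.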

8.3.2 A DIT FFT Program 

The flow chart for the DIT FFT program is shown in Figure 8-4. The FFT program is divided 
into three subroutines. The first subroutine scrambles the input data. The next subroutine 
computes the FFT, and the third scales the output data. 

Four modules are created. The main module declares and initializes data buffers and calls 
subroutines. The other three modules contain the FFT, bit reversal, and block floating-point 
scaling subroutines. The main module and the FFT module are described in this section. The 
bit reversal and block floating-point scaling modules are described later in this section. 

Main Module 

The dit_fft_main module is shown in Listing 8-1. N is the number of points in the FFT (in this
example, N = 1024) and N_div_2 is used for specifying the lengths of buffers. The number of
points in the FFT can be changed by changing the value of these constants and the twiddle
factors. The data buffers twid_real and twid_imag in program memory hold the twiddle factor
cosine and -sine values. The inplacereal, inplaceimag, inputreal and inputimag buffers in data
memory store real and imaginary data values. Sequentially ordered input data is stored in
inputreal and inputimag. This data is scrambled and written to inplacereal and inplaceimag,
which are the data buffers used by the in-place FFT. A four-location buffer called padding is
placed at the end of inplaceimag to allow data accesses to exceed the buffer length. If no padding
was placed after inplaceimag and the program attempted to read undefined memory locations,
the ADSP-2101 Simulator would signal an error. This buffer assists in debugging but is not
necessary in a real system. Variables (one-location buffers) groups, bflys_per_group,
blk_exponent and node_space are declared last.

Figure 8-4: Radix-2 DIT FFT Flow Chart

The real part (cosine values) of the twiddle factors in the twid_real.dat file are placed in
the buffer twid_real. Likewise, twid_imag.dat is placed in twid_imag. The variable groups is
initialized to N_div_2, and bflys_per_group and node_space are initialized to one because there
are N/2 groups of one butterfly in the first stage of the FFT. The blk_exponent is initialized to
zero. This exponent value is updated when the output data is scaled.

Two subroutines are called. The first subroutine places the input sequence in bit-reversed
order. The second performs the FFT and calls the block floating-point scaling routine.



{ DIT FFT Main program                                        DIT_MAIN.DSP

  This program computes a 1024-point DFT of data values stored in data
  buffers using the DIT FFT algorithm. The input data is assumed to be
  complex. The real and imaginary input data are stored in the inputreal
  and inputimag buffers, respectively. The input data is read from the
  disk files INPUTREA.DAT and INPUTIMA.DAT. The real and imaginary parts
  of the twiddle factors are read from disk files TWID_REA.DAT and
  TWID_IMA.DAT, respectively. The output DFT values are available in data
  memory locations inplacereal and inplaceimag.
}

.MODULE/ABS=4           dit_fft_main;
.CONST                  N=1024, N_div_2=512;        {Const. for 1024 points}

.VAR/PM/RAM/CIRC        twid_real[N_div_2];
.VAR/PM/RAM/CIRC        twid_imag[N_div_2];
.VAR/DM/RAM/ABS=0       inplacereal[N], inplaceimag[N], padding[4];
.VAR/DM/RAM/ABS=0x1000  inputreal[N], inputimag[N];
.VAR/DM/RAM             groups, bflys_per_group,
                        node_space, blk_exponent;

.INIT       twid_real: <twid_real.dat>;
.INIT       twid_imag: <twid_imag.dat>;
.INIT       inputreal: <inputreal.dat>;
.INIT       inputimag: <inputimag.dat>;
.INIT       inplaceimag: <inputimag.dat>;
.INIT       groups: N_div_2;
.INIT       bflys_per_group: 1;
.INIT       node_space: 1;
.INIT       blk_exponent: 0;
.INIT       padding: 0,0,0,0;                       {Zeros after inplaceimag}

.GLOBAL     inplacereal, inplaceimag;
.GLOBAL     inputreal, inputimag;
.GLOBAL     twid_real, twid_imag;
.GLOBAL     groups, bflys_per_group, node_space, blk_exponent;

.EXTERNAL   scramble, fft_strt;

            CALL scramble;                          {Subroutine calls}
            CALL fft_strt;
            TRAP;                                   {Halt program}

.ENDMOD;

Listing 8-1: Main Module, Radix-2 DIT FFT 

DIT FFT Module 

The FFT routine uses three nested loops. The inner loop computes butterflies, the middle loop 
controls the grouping of these butterflies, and the outer loop controls the FFT stage character- 
istics. These loops are described separately in the following sections. The complete routine is 
presented in Listing 8-7. 

Butterfly Loop 

The butterfly calculation involves a complex multiplication, a complex addition, and a complex 
subtraction. These operations can potentially cause the butterfly data to grow by two bits from 
input to output. For example, if x0 is 0x07FF (five sign bits), x0' could be 0x100F (three sign 
bits). Because of this bit growth, precautions must be taken to ensure that 16-bit data never 
overflows. An example of bit growth and overflow is shown below. 

Bit Growth: 
    Input to butterfly       0x0F00 = 0000 1111 0000 0000 
    Output from butterfly    0x1E00 = 0001 1110 0000 0000 

Overflow: 
    Input to butterfly       0x7000 = 0111 0000 0000 0000 
    Output from butterfly    0xE000 = 1110 0000 0000 0000 

In the overflow case, the positive input 0x7000 produces the output 0xE000, which is too 
large to represent as a positive, signed 16-bit number. 0xE000 is erroneously interpreted as a 
negative number. 

To avoid errors caused by overflow, one of three methods of compensating for bit growth 
can be applied: 

• Input data scaling 

• Unconditional block floating-point scaling (output data) 

• Conditional block floating-point scaling (output data) 

Three different code segments for the butterfly calculation are presented in this section; 
each uses a different method of compensating for bit growth. 

One way to ensure that overflow never occurs is to include enough extra sign bits, called 
guard bits, in the FFT input data to ensure that bit growth never results in overflow (Rabiner 
and Gold, [11]). Data can grow by a maximum factor of 2.4 from butterfly input to output (two 
bits of growth). However, a data value cannot grow by this maximum amount in two consecutive 
stages. The number of guard bits necessary to compensate for the maximum possible bit growth 
in an N-point FFT is $\log_2 N + 1$. For example, each of the input samples of a 32-point FFT 
(requiring five stages) must contain six guard bits, so ten bits are available for data (one sign 
bit, nine magnitude bits). This method requires no data shifting and is therefore the fastest of 
the three methods discussed in this section. However, for large FFTs the resolution of the input 
data is greatly limited. For small, low-precision FFTs, this is the fastest and most efficient 
method. 
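
The guard-bit arithmetic above can be checked with a short sketch. This is an illustration in Python rather than ADSP-2101 code; the function names guard_bits and data_bits are hypothetical and do not appear in the listings.

```python
WORD_BITS = 16  # width of an ADSP-2101 data word


def guard_bits(n_points: int) -> int:
    """Guard bits needed for an N-point radix-2 FFT: log2(N) + 1."""
    stages = n_points.bit_length() - 1
    if n_points < 2 or (1 << stages) != n_points:
        raise ValueError("n_points must be a power of two, at least 2")
    return stages + 1


def data_bits(n_points: int) -> int:
    """Bits left for the sample itself (sign bit included)."""
    return WORD_BITS - guard_bits(n_points)


if __name__ == "__main__":
    for n in (8, 32, 1024):
        print(n, guard_bits(n), data_bits(n))
```

For the 1024-point FFT used in this section the sketch gives 11 guard bits, which leaves only five bits for data. This is why input data scaling suits small, low-precision FFTs only.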

The code segment for a butterfly with no shifting is shown in Listing 8-2. This section of 
code computes one butterfly equation while setting up values for the next butterfly. The butterfly 
outputs (x0', y0', x1' and y1') are written over the inputs to the butterfly (x0, y0, x1 and y1) in the 
boldface instructions. The input and output parameters of the butterfly loop are shown below. 

Initial Conditions                      Final Conditions 

MX0 = x1                                MX0 = next x1 
MY0 = C                                 MY0 = next C 
MY1 = (-S)                              MY1 = next (-S) 
I0 -> x0                                I0 -> next x0 
I1 -> x1                                I1 -> next x1 
I2 -> y0                                I2 -> next y0 
I3 -> y1                                I3 -> next y1 
I4 -> next C                            I4 -> C after next C 
I5 -> next (-S)                         I5 -> (-S) after next (-S) 
I6 -> y1                                I6 -> next y1 
CNTR = butterfly count                  CNTR = butterfly count - 1 
M0 = 0 
M1 = 1 
M4 = twiddle factor modify value 
M5 = 1 



MR=MX0*MY0(SS),MX1=DM(I6,M5);           {MR=x1(C),MX1=y1}
MR=MR-MX1*MY1(RND),AY0=DM(I0,M0);       {MR=x1(C)-y1(-S),AY0=x0}
AR=MR1+AY0,AY1=DM(I2,M0);               {AR=x0+[x1(C)-y1(-S)],AY1=y0}
AR=AY0-MR1,DM(I0,M1)=AR;                {AR=x0-[x1(C)-y1(-S)],}
                                        {x0'=x0+[x1(C)-y1(-S)]}
MR=MX0*MY1(SS),DM(I1,M1)=AR;            {MR=x1(-S),x1'=x0-[x1(C)-y1(-S)]}
MR=MR+MX1*MY0(RND),MX0=DM(I1,M0),MY1=PM(I5,M4);
                                        {MR=x1(-S)+y1(C),MX0=next x1,}
                                        {MY1=next (-S)}
AR=AY1-MR1,MY0=PM(I4,M4);               {AR=y0-[x1(-S)+y1(C)],MY0=next C}
AR=MR1+AY1,DM(I3,M1)=AR;                {AR=y0+[x1(-S)+y1(C)],}
                                        {y1'=y0-[x1(-S)+y1(C)]}
DM(I2,M1)=AR;                           {y0'=y0+[x1(-S)+y1(C)]}

Listing 8-2: DIT FFT Butterfly, Input Data Scaled 
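
The arithmetic in the comments of Listing 8-2 can be restated as a small floating-point sketch. It is written in Python for illustration only, ignores the 1.15 fixed-point format and the instruction pipelining, and uses a hypothetical function name (dit_butterfly).

```python
import math


def dit_butterfly(x0, y0, x1, y1, c, neg_s):
    """One radix-2 DIT butterfly on real/imaginary parts.

    (x0, y0) and (x1, y1) are the two complex inputs; c and neg_s are the
    cosine and negated sine of the twiddle factor W = C + j(-S).
    Returns (x0', y0', x1', y1').
    """
    re = x1 * c - y1 * neg_s   # real part of (x1 + j*y1) * W
    im = x1 * neg_s + y1 * c   # imaginary part of (x1 + j*y1) * W
    return x0 + re, y0 + im, x0 - re, y0 - im


if __name__ == "__main__":
    # Worst-case growth: components of magnitude 1 with W = exp(-j*pi/4)
    c = math.cos(math.pi / 4)
    neg_s = -math.sin(math.pi / 4)
    print(dit_butterfly(1.0, 0.0, 1.0, 1.0, c, neg_s))
```

With these inputs the real output x0' is 1 + sqrt(2), about 2.41, which matches the maximum growth factor of 2.4 quoted above.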



Another way to compensate for bit growth is to scale the outputs down by a factor of two 
unconditionally after each stage. This approach is called unconditional block floating-point 
scaling. Initially, two guard bits are included in the input data to accommodate the maximum 
bit growth in the first stage. In each butterfly of a stage calculation, the data can grow into the 
guard bits. To prevent overflow in the next stage, the guard bits are replaced before the next 
stage is executed by shifting the entire block of data one bit to the right and updating the block 
exponent. This shift is necessary after every stage except the last, because no overflow can occur 
after the last stage. 

The input data to an unconditional block floating-point FFT can have at most 14 bits (one 
sign bit and 13 magnitude bits). In the FFT calculation, the data loses a total of $(\log_2 N) - 1$ bits 
because of shifting. Unconditional block floating-point scaling results in the same number of 
bits lost as in input data scaling. However, it produces more precise results because the FFT 
starts with more precise input data. The tradeoff is a slower FFT calculation because of the extra 
cycles needed to shift the output of each stage. 

The code for the unconditional block floating-point butterfly is shown in Listing 8-3. 
Instructions that write butterfly results to memory are boldface. After the last stage of the FFT, 
no compensation for bit growth is needed, so a butterfly with no shifting can be used in the last 
stage. 


Initial Conditions                      Final Conditions 

SR0 = last y0'                          SR0 = y0' 
MX0 = x1                                MX0 = next x1 
MX1 = y1                                MX1 = next y1 
MY0 = C                                 MY0 = next C 
MY1 = (-S)                              MY1 = next (-S) 
I0 -> x0                                I0 -> next x0 
I1 -> x1                                I1 -> next x1 
I2 -> last y0'                          I2 -> y0' 
I3 -> y1                                I3 -> next y1 
I4 -> next C                            I4 -> C after next C 
I5 -> next (-S)                         I5 -> (-S) after next (-S) 
I6 -> next y1                           I6 -> y1 after next y1 
CNTR = butterfly count                  CNTR = butterfly count - 1 
M0 = 0 
M1 = 1 
M4 = twiddle factor modify value 
M5 = 1 
SE = -1 



MR=MX0*MY0(SS),DM(I2,M1)=SR0;           {MR=x1(C),last y0=last y0'}
MR=MR-MX1*MY1(RND),AY0=DM(I0,M0);       {MR=x1(C)-y1(-S),AY0=x0}
AR=MR1+AY0,AY1=DM(I2,M0);               {AR=x0+[x1(C)-y1(-S)],AY1=y0}
SR=ASHIFT AR(LO);                       {Shift result right 1 bit}
DM(I0,M1)=SR0,AR=AY0-MR1;               {x0'=x0+[x1(C)-y1(-S)],}
                                        {AR=x0-[x1(C)-y1(-S)]}
SR=ASHIFT AR(LO);                       {Shift result right 1 bit}
DM(I1,M1)=SR0,MR=MX0*MY1(SS);           {x1'=x0-[x1(C)-y1(-S)],MR=x1(-S)}
MR=MR+MX1*MY0(RND),MX0=DM(I1,M0),MY1=PM(I5,M4);
                                        {MR=x1(-S)+y1(C),MX0=next x1,}
                                        {MY1=next (-S)}
AR=AY1-MR1,MY0=PM(I4,M4);               {AR=y0-[x1(-S)+y1(C)],MY0=next C}
SR=ASHIFT AR(LO),MX1=DM(I6,M5);         {Shift result right 1 bit,}
                                        {MX1=next y1}
DM(I3,M1)=SR0,AR=MR1+AY1;               {y1'=y0-[x1(-S)+y1(C)],}
                                        {AR=y0+[x1(-S)+y1(C)]}
SR=ASHIFT AR(LO);                       {Shift result right 1 bit}

Listing 8-3: DIT FFT Butterfly, Unconditional Block Floating-Point Scaling 



In conditional block floating-point scaling, data is shifted only if bit growth occurs. If one 
or more outputs grows, the entire block of data is shifted to the right and the block exponent is 
updated. For example, if the original block exponent is 0 and data is shifted three positions, the 
resulting block exponent is +3. 

The code segment for the conditional block floating-point butterfly is shown in Listing 
8-4. As in the other types of butterflies, one butterfly equation is calculated and its outputs (x0', 
y0', x1' and y1') are written over its inputs (x0, y0, x1 and y1) in the boldface instructions. 

The conditional block floating-point butterfly checks each butterfly output for growth 
with the EXPADJ instruction. This instruction does no shifting; instead, it monitors the output 
data and updates the SB register if bit growth is detected. (See the ADSP-2101 User's Manual 
for a complete description of this instruction.) If shifting is necessary it is performed after the 
entire stage is complete (in the block floating-point scaling routine). The butterfly code computes 
one butterfly equation while setting up values for the next butterfly. The input and output 
parameters of the butterfly loop are as follows: 



Initial Conditions                      Final Conditions 

MX0 = x1                                MX0 = next x1 
MX1 = y1                                MX1 = next y1 
MY0 = C                                 MY0 = next C 
MY1 = (-S)                              MY1 = next (-S) 
I0 -> x0                                I0 -> next x0 
I1 -> x1                                I1 -> next x1 
I2 -> y0                                I2 -> next y0 
I3 -> y1                                I3 -> next y1 
I4 -> next C                            I4 -> C after next C 
I5 -> next (-S)                         I5 -> (-S) after next (-S) 
CNTR = butterfly count                  CNTR = butterfly count - 1 
M1 = 1 
M4 = twiddle factor modify value 
M0 = 0 
SB = block exponent for this stage 



MR=MX0*MY1(SS),AX0=DM(I0,M0);           {MR=x1(-S),AX0=x0}
MR=MR+MX1*MY0(RND),AX1=DM(I2,M0);       {MR=[y1(C)+x1(-S)],AX1=y0}
AY1=MR1,MR=MX0*MY0(SS);                 {AY1=[y1(C)+x1(-S)],MR=x1(C)}
MR=MR-MX1*MY1(RND);                     {MR=[x1(C)-y1(-S)]}
AY0=MR1,AR=AX1-AY1;                     {AY0=[x1(C)-y1(-S)],}
                                        {AR=y0-[y1(C)+x1(-S)]}
SB=EXPADJ AR,DM(I3,M1)=AR;              {Check for bit growth,}
                                        {y1'=y0-[y1(C)+x1(-S)]}
AR=AX0-AY0,MX1=DM(I3,M0),MY1=PM(I5,M4);
                                        {AR=x0-[x1(C)-y1(-S)],MX1=next y1,}
                                        {MY1=next (-S)}
SB=EXPADJ AR,DM(I1,M1)=AR;              {Check for bit growth,}
                                        {x1'=x0-[x1(C)-y1(-S)]}
AR=AX0+AY0,MX0=DM(I1,M0),MY0=PM(I4,M4);
                                        {AR=x0+[x1(C)-y1(-S)],MX0=next x1,}
                                        {MY0=next C}
SB=EXPADJ AR,DM(I0,M1)=AR;              {Check for bit growth,}
                                        {x0'=x0+[x1(C)-y1(-S)]}
AR=AX1+AY1;                             {AR=y0+[y1(C)+x1(-S)]}
SB=EXPADJ AR,DM(I2,M1)=AR;              {Check for bit growth,}
                                        {y0'=y0+[y1(C)+x1(-S)]}

Listing 8-4: DIT FFT Butterfly, Conditional Block Floating-Point Scaling 

Group Loop 

The group loop controls the grouping of butterflies. It sets pointers to the input data and twiddle 
factors of the first butterfly in the group, initializes the butterfly counter and sets up the butterfly 
loop for each group. 

The code segment for the group loop is shown in Listing 8-5. This code is designed for 
the conditional block floating-point butterfly and thus requires slight modification for use with 
the other types (input scaling, unconditional block floating-point) of butterflies. The first 
butterfly of every group in the first stage of the DIT FFT has a twiddle factor of $W^0$. Thus, I4 and 
I5 are initialized to point to the cosine and sine values of $W^0$ before the butterfly loop is entered. 
In the group loop, the butterfly counter is initialized and initial butterfly data is fetched. The 
butterfly loop is executed bflys_per_group times to compute all butterflies in the group. After 
the butterfly loop is complete, pointers I0, I1, I2 and I3 are updated with the MODIFY instruction 
to point to x0, x1, y0 and y1 of the first butterfly in the next group. The group loop is executed 
groups times. 

The input and output parameters of the group loop are as follows: 

Initial Conditions                          Final Conditions 

I0 -> x0 of first butterfly in group        I0 -> x0 of first butterfly in next group 
I1 -> x1 of first butterfly in group        I1 -> x1 of first butterfly in next group 
I2 -> y0 of first butterfly in group        I2 -> y0 of first butterfly in next group 
I3 -> y1 of first butterfly in group        I3 -> y1 of first butterfly in next group 
CNTR = group count                          CNTR = group count - 1 
M2 = node_space 



            I4=^twid_real;                  {Initialize twiddle factor pointers}
            I5=^twid_imag;
            CNTR=DM(bflys_per_group);       {Initialize butterfly counter}
            MY0=PM(I4,M4),MX0=DM(I1,M0);    {MY0=C, MX0=x1}
            MY1=PM(I5,M4),MX1=DM(I3,M0);    {MY1=(-S), MX1=y1}
            DO bfly_loop UNTIL CE;
bfly_loop:      {Calculate All Butterflies in Group}
            MODIFY(I0,M2);                  {I0 -> first x0 in next group}
            MODIFY(I1,M2);                  {I1 -> first x1 in next group}
            MODIFY(I2,M2);                  {I2 -> first y0 in next group}
group_loop: MODIFY(I3,M2);                  {I3 -> first y1 in next group}

Listing 8-5: Radix-2 DIT FFT Group Loop 



Stage Loop 

The stage loop controls the grouping characteristics of the FFT. These include the number of 
groups in a stage, the number of butterflies in each group, and the node spacing. The stage loop 
also calls a subroutine which performs conditional block floating-point scaling on the outputs 
of a stage calculation. Note that if unconditional block floating-point scaling or input data scaling 
were used, this call would be omitted. 

The stage loop code for a conditional block floating-point FFT is shown in Listing 8-6. 
The stage loop sets up the group loop by initializing I0, I1, I2 and I3 to point to x0, x1, y0 and y1, 
respectively, for the first butterfly in the first group. It also initializes the group loop counter 
and node space modifier so that pointers can be updated for new groups. The value of the twiddle 
factor exponent is increased by groups for each butterfly. M4, initialized to groups, is the modifier 
for the twiddle factor pointers. 

The group loop calculates all groups in the stage. After the group loop is complete, a block 
floating-point subroutine is called to check the stage outputs for bit growth and scale the data 
if necessary. The stage characteristics are then updated for the next stage; bflys_per_group and 
node_space are doubled and groups is divided by two. 

The input and output parameters for the stage loop are as follows. Note that all the 
parameters except the stage count are passed in memory. 



Initial Conditions                              Final Conditions 

groups = # groups, current stage                groups = # groups, next stage 
bflys_per_group = # butterflies/group           bflys_per_group = # butterflies/group, next stage 
node_space = node spacing, current stage        node_space = node spacing, next stage 
CNTR = stage count                              CNTR = stage count - 1 
inplacereal = real stage input data             inplacereal = real stage output data 
inplaceimag = imag. stage input data            inplaceimag = imag. stage output data 



            I0=^inplacereal;            {I0 -> first x0 in first group of stage}
            I2=^inplaceimag;            {I2 -> first y0 in first group of stage}
            SB=-2;                      {SB = -(number of guard bits)}
            SI=DM(groups);              {SI = groups}
            CNTR=SI;                    {Initialize group counter}
            M4=SI;                      {Initialize twiddle factor modifier}
            M2=DM(node_space);          {Initialize node spacing modifier}
            I1=I0;
            MODIFY(I1,M2);              {I1 -> first x1 of first group in stage}
            I3=I2;
            MODIFY(I3,M2);              {I3 -> first y1 of first group in stage}
            DO group_loop UNTIL CE;
group_loop:     {Compute All Groups in Stage}
            CALL bfp_adj;               {Adjust stage output for bit growth}
            SI=DM(bflys_per_group);
            SR=ASHIFT SI BY 1(LO);
            DM(node_space)=SR0;         {node_space=node_space x 2}
            DM(bflys_per_group)=SR0;    {bflys_per_group=bflys_per_group x 2}
            SI=DM(groups);
            SR=ASHIFT SI BY -1(LO);
            DM(groups)=SR0;             {groups = groups / 2}

Listing 8-6: Radix-2 DIT FFT Stage Loop 
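
How the three stage characteristics evolve can be tabulated with a short sketch. It is a Python illustration, not part of the ADSP-2101 modules, and the function name stage_parameters is hypothetical; the variable names follow the listing.

```python
def stage_parameters(n_points: int):
    """Return (groups, bflys_per_group, node_space) for every FFT stage."""
    stages = n_points.bit_length() - 1
    if n_points < 2 or (1 << stages) != n_points:
        raise ValueError("n_points must be a power of two, at least 2")
    groups, bflys_per_group, node_space = n_points // 2, 1, 1
    table = []
    for _ in range(stages):
        table.append((groups, bflys_per_group, node_space))
        # Same update as the end of the stage loop in Listing 8-6
        bflys_per_group *= 2
        node_space *= 2
        groups //= 2
    return table


if __name__ == "__main__":
    for stage, row in enumerate(stage_parameters(8), start=1):
        print(stage, row)
```

In every stage the product groups x bflys_per_group stays at N/2, so each stage computes the same number of butterflies.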

DIT FFT Subroutine 

The complete conditional block floating-point radix-2 DIT FFT routine is shown in Listing 8-7. 
The constants N and logN are the number of points and the number of stages in the FFT, 
respectively. To change the number of points in the FFT, you modify these constants. Notice 
that the length and modify registers (that retain the same values throughout the FFT calculation) 
and the stage counter are initialized before the stage loop is executed. Instructions that write 
butterfly results to memory are boldface. 



{ Radix-2 DIT FFT Subroutine                                  R2DITFFT.DSP
  Performs Radix-2 DIT FFT

  Calling Parameters
      inplacereal = Real input data in scrambled order
      inplaceimag = All zeroes (real input assumed)
      twid_real = Twiddle factor cosine values
      twid_imag = Twiddle factor sine values
      groups = N/2
      bflys_per_group = 1
      node_space = 1

  Return Values
      inplacereal = Real FFT results in sequential order
      inplaceimag = Imaginary FFT results in sequential order

  Altered Registers
      I0,I1,I2,I3,I4,I5,L0,L1,L2,L3,L4,L5
      M0,M1,M2,M3,M4,M5
      AX0,AX1,AY0,AY1,AR,AF
      MX0,MX1,MY0,MY1,MR,SB,SE,SR,SI

  Altered Memory
      inplacereal, inplaceimag, groups, node_space,
      bflys_per_group, blk_exponent
}

.MODULE     radix2_dit_fft;

.CONST      logN=10, N=1024;            {Set constants for N-point FFT}
.EXTERNAL   twid_real, twid_imag;
.EXTERNAL   inplacereal, inplaceimag;
.EXTERNAL   groups, bflys_per_group, node_space;
.EXTERNAL   bfp_adj;
.ENTRY      fft_strt;

fft_strt:   CNTR=logN;                  {Initialize stage counter}
            M0=0;
            M1=1;
            L1=0;
            L2=0;
            L3=0;
            L4=%twid_real;
            L5=%twid_imag;
            DO stage_loop UNTIL CE;     {Compute all stages in FFT}
              I0=^inplacereal;          {I0 -> x0 in 1st grp of stage}
              I2=^inplaceimag;          {I2 -> y0 in 1st grp of stage}
              SB=-2;                    {SB to detect data > 14 bits}
              SI=DM(groups);
              CNTR=SI;                  {CNTR = group counter}
              M4=SI;                    {M4=twiddle factor modifier}
              M2=DM(node_space);        {M2=node space modifier}
              I1=I0;
              MODIFY(I1,M2);            {I1 -> x1 of 1st grp in stage}
              I3=I2;
              MODIFY(I3,M2);            {I3 -> y1 of 1st grp in stage}
              DO group_loop UNTIL CE;
                I4=^twid_real;                  {I4 -> C of W0}
                I5=^twid_imag;                  {I5 -> (-S) of W0}
                CNTR=DM(bflys_per_group);       {CNTR = butterfly counter}
                MY0=PM(I4,M4),MX0=DM(I1,M0);    {MY0=C, MX0=x1}
                MY1=PM(I5,M4),MX1=DM(I3,M0);    {MY1=-S, MX1=y1}
                DO bfly_loop UNTIL CE;
                  MR=MX0*MY1(SS),AX0=DM(I0,M0);     {MR=x1(-S),AX0=x0}
                  MR=MR+MX1*MY0(RND),AX1=DM(I2,M0);
                                                    {MR=(y1(C)+x1(-S)),AX1=y0}
                  AY1=MR1,MR=MX0*MY0(SS);           {AY1=y1(C)+x1(-S),MR=x1(C)}
                  MR=MR-MX1*MY1(RND);               {MR=x1(C)-y1(-S)}
                  AY0=MR1,AR=AX1-AY1;               {AY0=x1(C)-y1(-S),}
                                                    {AR=y0-[y1(C)+x1(-S)]}
                  SB=EXPADJ AR,DM(I3,M1)=AR;        {Check for bit growth,}
                                                    {y1'=y0-[y1(C)+x1(-S)]}
                  AR=AX0-AY0,MX1=DM(I3,M0),MY1=PM(I5,M4);
                                                    {AR=x0-[x1(C)-y1(-S)],}
                                                    {MX1=next y1,MY1=next (-S)}
                  SB=EXPADJ AR,DM(I1,M1)=AR;        {Check for bit growth,}
                                                    {x1'=x0-[x1(C)-y1(-S)]}
                  AR=AX0+AY0,MX0=DM(I1,M0),MY0=PM(I4,M4);
                                                    {AR=x0+[x1(C)-y1(-S)],}
                                                    {MX0=next x1,MY0=next C}
                  SB=EXPADJ AR,DM(I0,M1)=AR;        {Check for bit growth,}
                                                    {x0'=x0+[x1(C)-y1(-S)]}
                  AR=AX1+AY1;                       {AR=y0+[y1(C)+x1(-S)]}
bfly_loop:        SB=EXPADJ AR,DM(I2,M1)=AR;        {Check for bit growth,}
                                                    {y0'=y0+[y1(C)+x1(-S)]}
                MODIFY(I0,M2);                  {I0 -> 1st x0 in next group}
                MODIFY(I1,M2);                  {I1 -> 1st x1 in next group}
                MODIFY(I2,M2);                  {I2 -> 1st y0 in next group}
group_loop:     MODIFY(I3,M2);                  {I3 -> 1st y1 in next group}
              CALL bfp_adj;                     {Compensate for bit growth}
              SI=DM(bflys_per_group);
              SR=ASHIFT SI BY 1(LO);
              DM(node_space)=SR0;               {node_space=node_space x 2}
              DM(bflys_per_group)=SR0;          {bflys_per_group=}
                                                {bflys_per_group x 2}
              SI=DM(groups);
              SR=ASHIFT SI BY -1(LO);
stage_loop:   DM(groups)=SR0;                   {groups=groups / 2}
            RTS;

.ENDMOD;

Listing 8-7: Radix-2 DIT FFT Routine, Conditional Block Floating-Point 
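
As a cross-check of the loop structure, the same three nested loops can be sketched in floating point. This Python version is only an illustration: it omits block floating-point scaling and fixed-point effects, the function names (bit_reverse, dit_fft) are hypothetical, and a complex twiddle table stands in for twid_real and twid_imag.

```python
import cmath


def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def dit_fft(samples):
    """In-place radix-2 DIT FFT with stage, group and butterfly loops."""
    n = len(samples)
    log_n = n.bit_length() - 1
    if n < 2 or (1 << log_n) != n:
        raise ValueError("length must be a power of two, at least 2")
    # Scramble the input (the job of the scramble routine)
    data = [complex(samples[bit_reverse(i, log_n)]) for i in range(n)]
    # Twiddle factors W^k = C + j(-S), k = 0 .. N/2-1
    twiddle = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]
    groups, bflys_per_group, node_space = n // 2, 1, 1
    for _stage in range(log_n):                 # stage loop
        top = 0
        for _group in range(groups):            # group loop
            tw = 0                              # pointer back to W^0
            for _bfly in range(bflys_per_group):    # butterfly loop
                bottom = top + node_space
                product = data[bottom] * twiddle[tw]
                data[bottom] = data[top] - product
                data[top] = data[top] + product
                top += 1
                tw += groups                    # twiddle modify value
            top += node_space                   # first butterfly of next group
        groups //= 2
        bflys_per_group *= 2
        node_space *= 2
    return data


if __name__ == "__main__":
    print([round(abs(v), 6) for v in dit_fft([1, 0, 0, 0, 0, 0, 0, 0])])
```

The twiddle index advances by groups per butterfly, which corresponds to loading M4 with groups in the stage loop.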



Bit Reversal 

Bit reversal is an addressing technique used in FFT calculations to obtain results in sequential 
order. Because the FFT repeatedly subdivides data sequences, the data and/or twiddle factors 
may be scrambled (in bit-reversed order). All radix-2 FFTs can be calculated with either the 
input sequence or the output sequence scrambled. The twiddle factors may also need to be 
scrambled, depending on the order of the input and output sequences. In this chapter, however, 
input and output sequences are set up so that twiddle factors are never scrambled. This simplifies 
the FFT explanation as well as the program. 

As described earlier, the input sequence to the radix-2 DIT FFT is scrambled before the 
FFT is performed. This scrambling is accomplished through bit reversal. Bit reversal operates 
on the binary number that represents the position of a sample within an array of samples. The 
bit-reversed position is the transpose of the bits of the binary number about its center; for example 
the transpose of the 3-bit binary number 100 is 001. (In this example, three bits represent eight 
positions, so bits zero and two are interchanged.) Four bits are needed to represent 16 positions, 
so in a 16-point sequence, bits zero and three and bits one and two would be interchanged. A 
1024-point sequence requires the reversal of ten bits. 
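
The transpose described above can be written as a few lines of code. The following Python sketch is an illustration only; the function name bit_reverse is hypothetical.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Mirror the lowest `bits` bits of index about their center."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # move the LSB to the other end
        index >>= 1
    return result


if __name__ == "__main__":
    print(format(bit_reverse(0b100, 3), "03b"))
    print([bit_reverse(i, 4) for i in range(16)])
```

Applying the function twice returns the original position, so the same routine both scrambles and unscrambles a sequence.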

The ADSP-2101 has a bit-reverse capability built into its data address generator #1 
(DAG1). When a mode bit is enabled (through software), the 14-bit address generated by DAG1 
is automatically bit-reversed for any data memory read or write. The two address generators of 
the ADSP-2101 greatly simplify bit reversal. One address generator can be used to read 
sequentially ordered data, and the other can be used to write the same data to its bit-reversed 
location. Because the address generators are independent, intermediate enabling and disabling 
of the bit-reverse mode is not needed. 

In many cases, fewer than 14 bits must be reversed (for example, an eight-point FFT needs 
only three bits reversed). Reversal of fewer than 14 bits is accomplished by adding the correct 
modify value to the address pointer after each memory access. The following example 
demonstrates bit reversal of ten bits using I0 to store the address to be reversed and M0 to store the 
modify value. 

First, we determine the first bit-reversed address. This address is the first 14-bit address 
with the ten least significant bits reversed. For the DIT FFT subroutine, the first address in the 
inplacereal buffer is 0x0000. If we reverse the ten least significant bits of 0x0000, we still have 
0x0000. Thus, we want to output 0x0000 as the first bit-reversed address. To do so, I0 must be 
initialized to the number that, when bit-reversed by the ADSP-2101 (all 14 bits), is 0x0000. In 
this case, that number is also 0x0000. 

The second bit-reversed address must be 0x0200 (0x0001 with ten least significant bits 
reversed). We must modify I0 to the value that, when bit-reversed (all 14 bits), is 0x0200. This 
value is 0x0010. Since I0 contains 0x0000, we must add 0x0010 to it. Thus, 0x0010 is loaded 
into M0. After the first data memory read or write, which outputs 0x0000, M0 is added to the 
(non-bit-reversed) address in I0 so that I0 contains 0x0010. On the second data memory read 
or write, I0 is bit-reversed (14 bits) and the resulting address is 0x0200, the correct second 
bit-reversed address. 

In general, the modify value is determined by raising two to the difference between 14 
and the number of bits to be reversed. In this ten-bit example, the value is $2^{(14-10)}$ = 0x0010. 
Adding this value to I0 after each memory access and reversing all 14 bits on the next memory 
access yields the correct bit-reversed addresses for ten bits. The first four bit-reversed addresses 
are shown below. 

Sequence    I0, Non-Bit-Reversed              I0, Bit-Reversed 
            Hex     Binary                    Hex     Binary 

0           0000    00 0000 0000 0000         0000    00 0000 0000 0000 
1           0010    00 0000 0001 0000         0200    00 0010 0000 0000 
2           0020    00 0000 0010 0000         0100    00 0001 0000 0000 
3           0030    00 0000 0011 0000         0300    00 0011 0000 0000 

Only the ten least significant bits are bit-reversed. Each time a data memory write 
is performed, I0 is modified by M0. Note that the modified I0 value is not bit-reversed. Bit 
reversal only occurs when a data memory read or write is executed. 
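
The address sequence in the table can be reproduced with a small model of the address generator. This Python sketch is a simplified illustration, not a description of the DAG1 hardware: it assumes a buffer starting at address 0x0000, and the function names (reverse_bits, modify_value, bit_reversed_addresses) are hypothetical.

```python
ADDRESS_BITS = 14  # width of the address that DAG1 bit-reverses


def reverse_bits(value: int, width: int) -> int:
    """Reverse the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def modify_value(bits_to_reverse: int) -> int:
    """Value for M0: two raised to (14 - number of bits to reverse)."""
    return 1 << (ADDRESS_BITS - bits_to_reverse)


def bit_reversed_addresses(bits_to_reverse: int, count: int):
    """Addresses output on successive accesses with I0 starting at 0."""
    i0 = 0
    m0 = modify_value(bits_to_reverse)
    addresses = []
    for _ in range(count):
        addresses.append(reverse_bits(i0, ADDRESS_BITS))  # reversed on access
        i0 = (i0 + m0) & ((1 << ADDRESS_BITS) - 1)        # I0 itself is not reversed
    return addresses


if __name__ == "__main__":
    print([format(a, "04X") for a in bit_reversed_addresses(10, 4)])
```

Over 1024 accesses the model visits every address from 0 to 1023 exactly once, which is the behavior the scramble routine relies on.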

Listing 8-8 shows the scramble routine which places the inputs to the DIT FFT in 
bit-reversed order. This module begins by initializing two constants. The first constant (N) is the 
number of input points in the FFT. The second constant (mod_value) is the modify value for 
the pointer which outputs the bit-reversed addresses. Pointers to the data buffers are initialized, 
and the bit-reverser is enabled for DAG1. In bit-reverse mode, any addresses output from 
registers I0, I1, I2, or I3 will be bit-reversed. I0 is used in scramble. 

The scramble routine assumes real input data. In this case, the imaginary data is all zeros 
and can be initialized directly into the inplaceimag buffer. The brev loop consists of two 
instructions. First, the sequentially ordered data is read from the inputreal buffer using I4 (from 
DAG2). Then, the same data is written to the bit-reversed location in the inplacereal buffer 
using I0 (from DAG1). After all the real input data has been placed in bit-reversed order in the 
inplacereal buffer, the bit-reverser is disabled for the rest of the FFT calculation. 







{ Bit-Reverse (Scramble) Subroutine                           DIT_BREV.DSP

  Calling Parameters
      Sequentially ordered input data in inputreal

  Return Values
      Scrambled input data in inplacereal

  Altered Registers
      I0,I4,M0,M4,AY1

  Altered Memory
      inplacereal
}

.MODULE     dit_scramble;
.CONST      N=1024,mod_value=0x0010;    {Initialize constants}
.EXTERNAL   inputreal, inplacereal;
.ENTRY      scramble;

scramble:   I4=^inputreal;              {I4 -> sequentially ordered data}
            I0=^inplacereal;            {I0 -> scrambled data}
            M4=1;
            M0=mod_value;               {M0=modifier for reversing N bits}
            L4=0;
            L0=0;
            CNTR=N;
            ENA BIT_REV;                {Enable bit-reversed outputs on DAG1}
            DO brev UNTIL CE;
              AY1=DM(I4,M4);            {Read sequentially ordered data}
brev:         DM(I0,M0)=AY1;            {Write data in bit-reversed location}
            DIS BIT_REV;                {Disable bit-reverse}
            RTS;                        {Return to calling program}

.ENDMOD;

Listing 8-8: Bit-Reverse (Scramble) Routine 

Block Floating-Point Scaling 

Block floating-point scaling is used to maximize the dynamic range of a fixed point data field. 
The block floating-point system is a hybrid between fixed-point and floating-point systems. 
Instead of each data word having its own exponent, the block floating-point format assumes the 
same exponent for an entire block of data. 

The initial input data contains enough guard bits to ensure that no overflow occurs in the 
first FFT stage. During each stage of the FFT calculation, bit growth can occur. This bit growth 
can result in magnitude bits replacing guard bits. Because the stage output data is used as input 
data for the next stage, these guard bits must be replaced; otherwise, an output of the next stage 
might overflow. In a conditional block floating-point FFT, bit growth is monitored in each stage 
calculation. When the stage is complete, the output data of the entire stage is shifted to replace 
any lost guard bits. 

Because a radix-2 butterfly calculation has the potential for two bits of growth, SB (the 
block floating-point exponent register) is initialized to -2. This sets up the ADSP-2101 block 
floating-point compare logic to detect any data with more than 13 bits of magnitude (or fewer 
than three sign bits). After each butterfly calculation, the EXPADJ instruction determines if bit 
growth occurred by checking the number of guard bits. For example, the value 1111 0000 0000 
0000 has an exponent of -3. The value 0111 1111 1111 1111 has an exponent of zero (no guard 
bits). If a butterfly result has an exponent larger than the value in SB, bit growth into the guard 
bits has occurred, and SB is assigned the larger exponent (if it has not already been changed by 
bit growth in a previous butterfly of the same stage). Therefore, at the end of each stage, SB 
contains the exponent of the largest butterfly result(s). If no bit growth occurred, SB is not 
changed. 
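
The exponent bookkeeping described above can be modeled in a few lines. The Python sketch below is a simplified model written for illustration, not a specification of the EXPADJ instruction (see the ADSP-2101 User's Manual for that); the function names exponent and track_sb are hypothetical.

```python
def exponent(word: int) -> int:
    """Exponent of a 16-bit word: minus the number of redundant sign bits."""
    word &= 0xFFFF
    sign = (word >> 15) & 1
    redundant = 0
    for bit in range(14, -1, -1):       # scan downward from the bit below the sign
        if (word >> bit) & 1 != sign:
            break
        redundant += 1
    return -redundant


def track_sb(results, sb: int = -2) -> int:
    """Model of SB after a stage: the largest exponent seen, starting at -2."""
    for word in results:
        sb = max(sb, exponent(word))
    return sb


if __name__ == "__main__":
    print(exponent(0xF000), exponent(0x7FFF))
    print(track_sb([0x0F00, 0x1E00]), track_sb([0x0F00, 0x2000]))
```

A stage whose outputs all keep at least two guard bits leaves the modeled SB at -2; a single output such as 0x2000, with only one guard bit left, raises it to -1.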

The dit_radix_2_bfp_adjust routine is shown in Listing 8-9. This routine performs block 
floating-point scaling on the outputs of each stage except the last of the DIT FFT. Because 
guard bits only need to be replaced to ensure that an output of the next stage does not overflow, 
the subroutine first checks to see if the block of data is the output of the last stage. If it is, no 
shifting is needed and the subroutine returns. If the data block is not the output of the last stage, 
shifting is necessary only if SB is not -2 (indicating that bit growth into guard bits occurred). 
If SB is -2, no bit growth occurred, so the subroutine returns. 

If bit growth occurred, shifting is needed. The subroutine determines the amount to shift 
from the value of SB. The data can grow by either one or two bits for each stage; therefore, if 
bit growth occurred, SB must be either -1 or zero. If SB is -1, the data block is shifted right one 
bit. If SB is not -1, it must be zero. In this case, the data block is shifted right two bits. When 
shifting is complete, the block exponent is updated by the shifted amount (one or two). 

In this routine, shifting to the right is performed through multiplication rather than shift 
instructions. Multiplication by an appropriate power of two gives a shifted result. For example, 
to shift a number two bits to the right, the number is multiplied by 0x2000 (0.25 in the 1.15 
fractional format), the value loaded into MY0 in Listing 8-9. In multiplication, 
the product can be rounded to preserve LSB information, whereas in shifting, this information 
is merely lost. Multiplication thus minimizes noise. 
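
The difference between a rounded multiplication and a plain shift can be seen in a short numerical sketch. This Python model is an illustration under stated assumptions: operands are 1.15 fractions held as signed integers, rounding is simple round-half-up (the processor's handling of exact midpoints may differ), and the function name mul_q15_rnd is hypothetical.

```python
ONE_BIT_RIGHT = 0x4000    # 0.5 in 1.15 format
TWO_BITS_RIGHT = 0x2000   # 0.25 in 1.15 format


def mul_q15_rnd(a: int, b: int) -> int:
    """Rounded 1.15 x 1.15 product, returned as a 1.15 integer."""
    return (a * b + (1 << 14)) >> 15    # add half an LSB, then drop 15 bits


if __name__ == "__main__":
    for value in (5, 6, 7, -7, 0x1E00):
        print(value, mul_q15_rnd(value, TWO_BITS_RIGHT), value >> 2)
```

For the input 7 the rounded product is 2 while the arithmetic shift gives 1; the multiplication keeps the result closer to the exact value 1.75.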

{ DIT Radix-2 Block Floating-Point Scaling Routine            DIT_BFPA.DSP

  Calling Parameters
      Radix-2 DIT FFT stage results in inplacereal and inplaceimag
      Note: This code assumes inplaceimag immediately follows
      inplacereal in memory.

  Return Parameters
      inplacereal and inplaceimag adjusted for bit growth

  Altered Registers
      I0,I1,AX0,AY0,AR,MX0,MY0,MR,CNTR

  Altered Memory
      inplacereal, inplaceimag, blk_exponent
}

.MODULE     dit_radix_2_bfp_adjust;
.CONST      Ntimes2 = 2048;
.EXTERNAL   inplacereal, blk_exponent;      {Begin declaration section}
.ENTRY      bfp_adj;

bfp_adj:    AY0=CNTR;                       {Check for last stage}
            AR=AY0-1;
            IF EQ RTS;                      {If last stage, return}
            AY0=-2;
            AX0=SB;
            AR=AX0-AY0;                     {Check for SB=-2}
            IF EQ RTS;                      {If SB=-2, no bit growth, return}
            I0=^inplacereal;                {I0=read pointer}
            I1=^inplacereal;                {I1=write pointer}
            AY0=-1;
            MY0=0x4000;                     {Set MY0 to shift 1 bit right}
            AR=AX0-AY0,MX0=DM(I0,M1);       {Check if SB=-1; Get first sample}
            IF EQ JUMP strt_shift;          {If SB=-1, shift block data 1 bit}
            AX0=-2;                         {Set AX0 for block exponent update}
            MY0=0x2000;                     {Set MY0 to shift 2 bits right}
strt_shift: CNTR=Ntimes2 - 1;               {Initialize loop counter}
            DO shift_loop UNTIL CE;         {Shift block of data}
              MR=MX0*MY0(RND),MX0=DM(I0,M1);    {MR=shifted data, MX0=next value}
shift_loop:   DM(I1,M1)=MR1;                {Unshifted data=shifted data}
            MR=MX0*MY0(RND);                {Shift last data word}
            AY0=DM(blk_exponent);           {Update block exponent and}
            DM(I1,M1)=MR1,AR=AY0-AX0;       {store last shifted sample}
            DM(blk_exponent)=AR;
            RTS;

.ENDMOD;

Listing 8-9: DIT Radix-2 Block Floating-Point Scaling Routine 



8.3.3 Exercises 

1. The DIT FFT modules described in this section perform a 1024-point DFT. 
Implementing a different size FFT requires changing the values of the constants N and 
N_div_2 in Listing 8-1. Similarly, a new set of twiddle factors twid_real and twid_imag 
must be computed and stored in data files. Modify the DIT_FFT_MAIN program 
(and its related modules) to compute the 8-point DFT depicted in Figure 8-2. 
Determine the proper twiddle factors and store their real and imaginary parts in 
data files. Create a data file containing 8 samples of the signal 

$$x(n) = \cos\frac{\pi n}{4}, \qquad n = 0, 1, \ldots, 7$$

Use the Simulator to compute and verify its DFT. 

2. The radix-2 DIT FFT implementation described in this section is the most compact 
form of the FFT in terms of program memory storage requirements. However, 
it is not the most efficient in terms of execution speed. It executes in 
a fully looped form which does not exploit the unique mathematical characteristics 
of the first and the last stage of the FFT. Specifically, all the multiplications (equations 
(8-7) through (8-10)) in the first stage are by a value of either 0 or 1 and therefore 
can be removed. Can you write an improved DIT FFT program and its modules to 
incorporate the above enhancement? A solution program is available on the diskette. 

8.4 RADIX-2 DECIMATION-IN-FREQUENCY FFT 

In the DIT FFT, each decimation consists of two steps. First, a DFT equation is expressed as 
the sum of two DFTs, one of even samples and one of odd samples. This equation is then divided 
into two equations, one that computes the first half of the output (frequency) samples and one 
that computes the second half. In the decimation-in-frequency (DIF) FFT, a DFT equation is 
expressed as the sum of two calculations, one on the first half of the samples and one on the 
second half of the samples. This equation is then expressed as two equations, one that computes 
even output samples and one that computes odd output samples. Decimation in time refers to 
grouping the input sequence into even and odd samples, whereas decimation in frequency refers 
to grouping the output (frequency) sequence into even and odd samples. Decimation-in-frequency 
can thus be visualized as repeatedly dividing the output sequence into even and odd 
samples in the same way that decimation in time divides down the input sequence (Proakis and 
Manolakis, [8]). 

8.4.1 The Algorithm 

The DIF FFT divides an N-point DFT into two summations, shown in (8-11). 

$$X(k) = \sum_{n=0}^{N-1} x(n)\,W_N^{nk} = \sum_{n=0}^{N/2-1} x(n)\,W_N^{nk} + \sum_{n=0}^{N/2-1} x(n+N/2)\,W_N^{(n+N/2)k} \tag{8-11}$$






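Because $W_N^{(n+N/2)k} = W_N^{nk} \cdot W_N^{(N/2)k}$ and $W_N^{(N/2)k} = (-1)^k$, equation (8-11) can also be expressed as 

$$\begin{aligned}
X(k) &= \sum_{n=0}^{N/2-1} x(n)\,W_N^{nk} + (-1)^k \sum_{n=0}^{N/2-1} x(n+N/2)\,W_N^{nk} \\
     &= \sum_{n=0}^{N/2-1} \left[x(n) + (-1)^k x(n+N/2)\right] W_N^{nk}, \qquad k = 0, \ldots, N-1
\end{aligned} \tag{8-12}$$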

The decimation of the output (frequency) sequence is accomplished by dividing X(k) into 
two equations, one that computes even output samples and one that computes odd output 
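samples. For even values of X(k), k=2r. 

$$\begin{aligned}
X(2r) &= \sum_{n=0}^{N/2-1} \left[x(n) + (-1)^{2r} x(n+N/2)\right] W_N^{2nr} \\
      &= \sum_{n=0}^{N/2-1} \left[x(n) + x(n+N/2)\right] W_{N/2}^{nr}, \qquad r = 0, \ldots, \frac{N}{2}-1
\end{aligned} \tag{8-13}$$

For odd values of X(k), k=2r+1. 

$$\begin{aligned}
X(2r+1) &= \sum_{n=0}^{N/2-1} \left[x(n) + (-1)^{2r+1} x(n+N/2)\right] W_N^{(2r+1)n} \\
        &= \sum_{n=0}^{N/2-1} \left[x(n) - x(n+N/2)\right] W_N^{n}\, W_{N/2}^{nr}, \qquad r = 0, \ldots, \frac{N}{2}-1
\end{aligned} \tag{8-14}$$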

Note that X(2r) and X(2r+1) are the results of N/2-point DFTs performed on the sum and 
difference of the first and second halves of the input sequence. In equation (8-14), the difference 
of the two halves of the input sequence is multiplied by a twiddle factor, $W_N^n$. Figure 8-5 illustrates 
the first decimation of the DIF FFT, which eliminates half ($N^2/2$) of the DFT calculations. 
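
Equations (8-13) and (8-14) can be checked numerically. The following Python sketch is only a verification aid under the definitions above; the function names are hypothetical, and a direct O(N^2) DFT is used as the reference.

```python
import cmath


def dft(x):
    """Direct DFT, used here as the reference computation."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]


def dif_first_decimation(x):
    """Return (X(2r), X(2r+1)) from two N/2-point DFTs, per (8-13) and (8-14)."""
    N = len(x)
    half = N // 2
    sums = [x[n] + x[n + half] for n in range(half)]
    # The difference of the two halves is multiplied by the twiddle factor W_N^n.
    diffs = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / N)
             for n in range(half)]
    return dft(sums), dft(diffs)


x = [1, 2, 3, 4, 0, -1, -2, -3]
X = dft(x)
even, odd = dif_first_decimation(x)
```

Here `even[r]` matches `X[2*r]` and `odd[r]` matches `X[2*r + 1]` to within rounding error, while each half-size DFT needs only a quarter of the multiplications of the full one.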

Each of the two N/2-point DFTs (X(2r) and X(2r+1)) is divided into two N/4-point DFTs 
in the same way as the N-point DFT is divided into two N/2-point DFTs. By the substitutions 

$$X_1(r) = X(2r)$$
$$x_1(n) = x(n) + x(n+N/2), \qquad n = 0, \ldots, \frac{N}{2}-1$$

the sequence of even samples in equation (8-13) becomes 

$$X_1(r) = \sum_{n=0}^{N/2-1} x_1(n)\,W_{N/2}^{nr} \tag{8-15}$$

Figure 8-5: First Decimation of DIF FFT 

This N/2-point sequence has the same form as the original N-point sequence in equation (8-11) 
and can be divided in half in the same manner to yield 

$$X_1(r) = \sum_{n=0}^{N/4-1} \left[x_1(n) + (-1)^r x_1(n+N/4)\right] W_{N/2}^{nr} \tag{8-16}$$

For even output samples, let r=2s. 

$$X_1(2s) = \sum_{n=0}^{N/4-1} \left[x_1(n) + x_1(n+N/4)\right] W_{N/4}^{ns} \tag{8-17}$$

For odd output samples, let r=2s+1. 

$$X_1(2s+1) = \sum_{n=0}^{N/4-1} \left[x_1(n) - x_1(n+N/4)\right] W_N^{2n}\, W_{N/4}^{ns} \tag{8-18}$$

X(2r+1) is also divided into two equations, one that computes even output samples and 
one that computes odd output samples, in the same way that X(2r) is divided into $X_1(2s)$ and 
$X_1(2s+1)$. Thus we have four N/4-point sequences. 

If we make the substitutions 

$$X_2(s) = X_1(2s)$$
$$x_2(n) = x_1(n) + x_1(n+N/4)$$

equation (8-17) becomes 

$$X_2(s) = \sum_{n=0}^{N/4-1} x_2(n)\,W_{N/4}^{ns} \tag{8-19}$$

The four N/4-point sequences that result from the decimation of X(2r) and X(2r+1) are 
divided to form eight N/8-point sequences in the third decimation. This process is repeated until 
the division of a sequence results in a pair of equations that together compute a two-point DFT. 
In this pair, the summation variable n (see equations (8-17) and (8-18)) is equal to zero only, 
so no summation is performed. The two-point DFT computed by this pair of equations is the 
core calculation (butterfly) for the radix-2 DIF FFT. 

Figure 8-6 shows the complete decimation for an eight-point DIF FFT. Notice that the 
inputs of the DIF FFT are in sequential order and the outputs are in scrambled order. The DIF 
FFT can also be performed with inputs in bit-reversed order, resulting in outputs in sequential 
order. In this case, however, the twiddle factors must be in bit-reversed order. In this chapter, 
the DIF FFT is presented with twiddle factors in sequential order to simplify programming. 




Figure 8-6: Eight-Point DIF FFT 



As in the DIT FFT, the 8-point DIF FFT butterflies are organized into groups and stages. 
In the eight-point FFT, the first stage has one group of four (N/2) butterflies. The second stage 
has two groups of two (N/4) butterflies, and the last has four groups of one butterfly. 

The DIF FFT butterfly is similar to that of the DIT FFT except that the twiddle factor 
multiplication occurs after rather than before the primary-node and dual-node subtraction. The 
DIF butterfly is illustrated graphically in Figure 8-7. The variables x and y represent the real 
and imaginary parts, respectively, of a sample. The twiddle factor can be divided into real and 
imaginary parts because $W_N = e^{-j2\pi/N} = \cos(2\pi/N) - j\sin(2\pi/N)$. In the program presented later 
in this section, the twiddle factors are initialized in memory as cosine and -sine values (not 
+sine). For this reason, the twiddle factors are shown in Figure 8-7 as C + j(-S). C represents 
cosine and -S represents -sine. 

Figure 8-7: Radix-2 DIF FFT Butterfly 

Equations (8-20) through (8-23) describe the DIF FFT butterfly outputs. 

$$x_0' = x_0 + x_1 \tag{8-20}$$
$$y_0' = y_0 + y_1 \tag{8-21}$$
$$x_1' = C(x_0 - x_1) - (-S)(y_0 - y_1) \tag{8-22}$$
$$y_1' = (-S)(x_0 - x_1) + C(y_0 - y_1) \tag{8-23}$$

As in the DIT FFT, the butterfly is performed in-place; that is, the results of each butterfly 
are written over the corresponding inputs. For example, $x_0'$ is written over $x_0$. 
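
The butterfly equations and the group/stage organization can be tried out in floating point before moving to assembly. The Python sketch below is a reference model only, not a translation of the ADSP-2101 code: it keeps real and imaginary parts in separate lists (the assembly program interleaves them), omits block floating-point scaling, and uses hypothetical function names.

```python
import math


def dif_fft(xr, xi):
    """In-place radix-2 DIF FFT; results are left in bit-reversed order."""
    N = len(xr)
    groups, bflys = 1, N // 2
    while bflys >= 1:
        for g in range(groups):
            base = g * 2 * bflys
            for b in range(bflys):
                p, d = base + b, base + b + bflys      # primary and dual nodes
                angle = 2 * math.pi * b * groups / N   # twiddle exponent steps by groups
                C, negS = math.cos(angle), -math.sin(angle)
                dx, dy = xr[p] - xr[d], xi[p] - xi[d]
                xr[p], xi[p] = xr[p] + xr[d], xi[p] + xi[d]   # (8-20), (8-21)
                xr[d] = C * dx - negS * dy                    # (8-22)
                xi[d] = negS * dx + C * dy                    # (8-23)
        groups, bflys = groups * 2, bflys // 2


def bit_reverse(k, bits):
    """Reverse the lowest `bits` bits of k."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (k & 1)
        k >>= 1
    return r


def unscramble(data):
    """Move each bit-reversed-order output to its sequential position."""
    bits = len(data).bit_length() - 1
    out = [0.0] * len(data)
    for k, value in enumerate(data):
        out[bit_reverse(k, bits)] = value
    return out


xr = [math.cos(math.pi * n / 4) for n in range(8)]
xi = [0.0] * 8
dif_fft(xr, xi)
Xr, Xi = unscramble(xr), unscramble(xi)
```

For this cosine input the only nonzero DFT values are X(1) = X(7) = 4, which gives a quick sanity check for a fixed-point implementation run in the Simulator.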

8.4.2 A DIF FFT Program 

The DIF flow chart is shown in Figure 8-8. Like the DIT FFT, the DIF FFT uses three 
subroutines. The first subroutine computes the FFT. The second subroutine performs conditional 
block floating-point scaling at the end of each stage (except the last). The third subroutine 
bit-reverses the locations of the FFT output data to "unscramble" the data. The DIF FFT 
subroutine is described in this section. The block floating-point and bit reversal routines are 
described later in this section. 

Main Module 

The module dif_fft_main is shown in Listing 8-10. The FFT calculation is performed in 
one buffer (inplacedata). In this program, the real and imaginary input data are interleaved in 
the buffer. The length of inplacedata is thus twice the number of points in the FFT and is 
specified by the constant N_x_2 (N_x_2 = 2048 for a 1024-point FFT). Unlike the DIT FFT, 
the DIF FFT is performed on sequentially ordered input data and produces data in bit-reversed 
order; therefore, no additional buffers for scrambling the input data are needed. 

Figure 8-8: Radix-2 DIF FFT Flow Chart 

When the output data is unscrambled, it is separated into real and imaginary values and 
placed in two buffers (real_results, imaginary_results). Twiddle-factor buffers are defined and 
initialized as in the DIT FFT. 

The DIF FFT uses the variables groups, bflys_per_group and blk_exponent. Because the 
first stage of the DIF FFT contains one group of N/2 butterflies, groups is initialized to one and 
bflys_per_group is initialized to N_div_2. The node spacing (node_space) is N instead of N/2 
because the real and imaginary input data are interleaved. 

Two subroutines are called. The first performs the DIF FFT and calls the block 
floating-point scaling routine. The second bit-reverses the FFT outputs to unscramble them. 





{ DIF FFT Main program                                      DIF_MAIN.DSP

  This program computes a 1024-point DFT of data values stored in the
  buffer inplacedata using DIF FFT algorithm. The input data is assumed
  to be complex. The real and imaginary input data are interleaved in
  the buffer. The input data is read from the disk file INPLACED.DAT.
  The real and imaginary parts of twiddle factors are read from disk
  files TWID_REA.DAT and TWID_IMA.DAT, respectively. The output DFT
  values are available in data memory locations real_results and
  imaginary_results.
}

.MODULE/ABS=4     dif_fft_main;
.CONST            N=1024, N_x_2=2048;         {Const. for 1024 points}
.CONST            N_div_2=512, log2N=10;
.VAR/DM/RAM       inplacedata[N_x_2];
.VAR/DM/RAM       real_results[N];
.VAR/DM/RAM       imaginary_results[N];
.VAR/PM/ROM/CIRC  twid_imag[N_div_2];
.VAR/PM/ROM/CIRC  twid_real[N_div_2];
.VAR/DM/RAM       groups, node_space, bflys_per_group, blk_exponent;
.INIT             inplacedata: <inplacedata.dat>;
.INIT             twid_imag: <twid_imag.dat>;
.INIT             twid_real: <twid_real.dat>;
.INIT             groups: 1;
.INIT             node_space: N;
.INIT             bflys_per_group: N_div_2;
.INIT             blk_exponent: 0;
.GLOBAL           inplacedata, real_results, imaginary_results;
.GLOBAL           twid_real, twid_imag;
.GLOBAL           groups, bflys_per_group, node_space, blk_exponent;
.EXTERNAL         unscramble, fft_start;

                  CALL fft_start;
                  CALL unscramble;
                  TRAP;

.ENDMOD;

Listing 8-10: Main Module, Radix-2 DIF FFT 

DIF FFT Module 

The conditional block floating-point DIF FFT program is described in this section. The butterfly 
loop is described first, then the group and stage loops. The complete FFT program is presented 
in Listing 8-14 at the end of this section. 



Butterfly Loop 

The code segment for the DIF butterfly (with conditional block floating-point scaling) is 
shown in Listing 8-11. The primary-node outputs x0' and y0' are calculated first 
and written over x0 and y0. Complex subtraction for the dual-node calculation is then performed, 
followed by the twiddle factor multiplication. The outputs x1' and y1' are written over x1 and y1. 
Instructions that write butterfly results to memory are boldface. Each butterfly output is checked 
for bit growth using the EXPADJ instruction. This loop is repeated bflys_per_group times. 

The input and output parameters for the butterfly loop are as follows: 



Initial Conditions 

AX0 = x0 
AY0 = x1 
AY1 = y1 
I0 -> y0 
I1 -> next x1 
I2 -> x1 
I4 -> C 
I5 -> (-S) 
M0 = -1 
M1 = 1 
M5 = twiddle factor modifier 
CNTR = butterfly count 

Final Conditions 

AX0 = next x0 
AY0 = next x1 
AY1 = next y1 
I0 -> next y0 
I1 -> x1 after next 
I2 -> next x1 
CNTR = butterfly count - 1 



AR=AX0+AY0, AX1=DM(I0,M0), MY0=PM(I4,M5);    {AR=x0+x1, AX1=y0, MY0=C, I0 -> x0}
SB=EXPADJ AR;                                 {Check for bit growth}
DM(I0,M1)=AR, AR=AX1+AY1;                     {x0'=x0+x1, AR=y0+y1, I0 -> y0}
SB=EXPADJ AR;                                 {Check for bit growth}
DM(I0,M1)=AR, AR=AX0-AY0;                     {y0'=y0+y1, AR=x0-x1, I0 -> next x0}
MX0=AR, AR=AX1-AY1;                           {MX0=x0-x1, AR=y0-y1}
MR=MX0*MY0(SS), AX0=DM(I0,M1), MY1=PM(I5,M5); {MR=(x0-x1)C, AX0=next x0, MY1=(-S)}
MR=MR-AR*MY1(RND), AY0=DM(I1,M1);             {MR=(x0-x1)C-(y0-y1)(-S),}
                                              { AY0=next x1}
SB=EXPADJ MR1;                                {Check for bit growth}
DM(I2,M1)=MR1, MR=AR*MY0(SS);                 {x1'=(x0-x1)C-(y0-y1)(-S),}
                                              { MR=(y0-y1)C}
MR=MR+MX0*MY1(RND), AY1=DM(I1,M1);            {MR=(y0-y1)C+(x0-x1)(-S),}
                                              { AY1=next y1}
DM(I2,M1)=MR1, SB=EXPADJ MR1;                 {y1'=(y0-y1)C+(x0-x1)(-S), check}
                                              {for bit growth}

Listing 8-11: Radix-2 DIF FFT Butterfly, Conditional Block Floating-Point 



Group Loop 

The group loop code is shown in Listing 8-12. The group loop sets up the butterfly loop by 
fetching initial data and initializing the butterfly loop counter. When all the butterflies in a group 
have been calculated, data pointers are updated to point to the inputs for the first butterfly of 
the next group. This loop is repeated until all groups in a stage are complete. The input and 
output parameters of the group loop are as follows: 

Initial Conditions 

I0 -> x0 of first butterfly in group 
I1 -> x1 of first butterfly in group 
I2 -> x1 of first butterfly in group 
CNTR = group count 
M1 = 1 
M2 = node_space 
M3 = node_space-2 

Final Conditions 

I0 -> x0 of first butterfly in next group 
I1 -> x1 of first butterfly in next group 
I2 -> x1 of first butterfly in next group 
CNTR = group count - 1 



            CNTR=DM(bflys_per_group);     {Initialize butterfly counter}
            AX0=DM(I0,M1);                {AX0=x0}
            AY0=DM(I1,M1);                {AY0=x1}
            AY1=DM(I1,M1);                {AY1=y1}
            DO bfly_loop UNTIL CE;
bfly_loop:     {Calculate All Butterflies in Group}
            MODIFY(I2,M2);                {I2 -> x1 of 1st butterfly in next group}
            MODIFY(I1,M3);                {I1 -> x1 of 1st butterfly in next group}
            MODIFY(I0,M3);
group_loop: MODIFY(I0,M1);                {I0 -> x0 of 1st butterfly in next group}

Listing 8-12: Radix-2 DIF FFT Group Loop 

Stage Loop 

The stage loop code is shown in Listing 8-13. This code segment sets up and computes all groups 
in a stage and controls stage characteristics, such as the number of groups in a stage. Pointers 
I0 and I1 are set to point to x0 and x1 of the first butterfly in the first group of the stage. Pointer 
I2 also points to x1 and is used to write the dual-node butterfly results to data memory. M3 is 
set to node_space-2 and is used to modify pointers for the next group. The group counter is 
initialized to groups, the number of groups in the stage. The twiddle factor modifier stored in 
M5 is also groups. This value is the exponent increment value for the twiddle factors of con- 
secutive butterflies in a group. 

The SB register is set to -2 to detect any bit growth into the guard bits of any butterfly 
output. When all the groups in a stage are computed, the bfp adjustment routine is called to 
check for bit growth and adjust the output data if necessary. Then parameters for the next stage 
are updated; groups is doubled and node_space and bflys_per_group are divided in half. The 
stage loop is repeated log2N times. 
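
The way these three parameters evolve from stage to stage can be tabulated with a few lines of Python. This is a sketch of the bookkeeping only, with a hypothetical function name; the starting values are those given for the main module (groups = 1, bflys_per_group = N/2, node_space = N because the data are interleaved).

```python
def dif_stage_parameters(n_points):
    """List (groups, bflys_per_group, node_space) for every DIF FFT stage."""
    groups, bflys, node_space = 1, n_points // 2, n_points
    stages = []
    while bflys >= 1:
        stages.append((groups, bflys, node_space))
        # groups doubles; node_space and bflys_per_group are halved
        groups, bflys, node_space = groups * 2, bflys // 2, node_space // 2
    return stages


stages_8 = dif_stage_parameters(8)
```

For an 8-point FFT this gives (1, 4, 8), (2, 2, 4) and (4, 1, 2), matching the group structure of Figure 8-6; for 1024 points the list has ten entries, one per stage.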

The input and output parameters of the stage loop are summarized below. Note that the 
parameters are passed in memory locations. 

Initial Conditions 

groups = # groups in stage 
bflys_per_group = # butterflies/group 
node_space = node spacing current stage 
inplacedata = stage input data 

Final Conditions 

groups = # groups in next stage 
bflys_per_group = # butterflies/group (next stage) 
node_space = node spacing next stage 
inplacedata = stage output data 

            I0=^inplacedata;              {I0 -> x0 in 1st butterfly of stage}
            I1=^inplacedata;
            AY0=DM(node_space);
            M2=AY0;                       {M2=dual node spacing}
            MODIFY(I1,M2);                {I1 -> x1 in 1st butterfly of stage}
            I2=I1;
            AX0=2;
            AR=AY0-AX0;
            M3=AR;                        {M3=node_space-2}
            CNTR=DM(groups);              {Initialize group counter}
            SB=-2;                        {Set minimum allowable sign bits to two}
            M5=DM(groups);                {M5=twiddle factor modifier}
            DO group_loop UNTIL CE;
group_loop:    {Calculate All Groups in Stage}
            CALL bfp_adjust;              {Adjust block data for bit growth}
            SI=DM(groups);
            SR=LSHIFT SI BY 1 (LO);
            DM(groups)=SR0;               {groups=groups x 2}
            SI=DM(node_space);
            SR=LSHIFT SI BY -1 (LO);
            DM(node_space)=SR0;           {node_space=node_space / 2}
            SR=LSHIFT SR0 BY -1 (LO);
            DM(bflys_per_group)=SR0;      {bflys_per_group=bflys_per_group / 2}

Listing 8-13: Radix-2 DIF FFT Stage Loop 



DIF FFT Subroutine 

The complete block floating-point DIF FFT subroutine is shown in Listing 8-14. Initializations 
of index, modifier and length registers that retain the same values throughout the FFT calculation 
are performed before the stage loop is entered. Instructions that write butterfly results to memory 
are boldface. 

{ Radix-2 DIF FFT Subroutine                                R2DIFFFT.DSP

  Performs Radix-2 DIF FFT

  Calling Parameters
      inplacedata = Real and imaginary input data interleaved in
                    sequential order
      twid_real = Twiddle factor cosine values
      twid_imag = Twiddle factor sine values
      groups = 1
      bflys_per_group = N/2
      node_space = N

  Return Values
      inplacedata = Real and Imaginary FFT results interleaved in
                    bit-reversed order
}

.MODULE     dif_fft;
.CONST      N=1024, N_div_2=512, logN=10;
.EXTERNAL   inplacedata, twid_real, twid_imag;
.EXTERNAL   groups, bflys_per_group, node_space;
.EXTERNAL   bfp_adjust;
.ENTRY      fft_start;

fft_start:  I4=^twid_real;                  {I4 -> C of W0}
            L4=N_div_2;
            I5=^twid_imag;                  {I5 -> (-S) of W0}
            L5=N_div_2;
            M0=-1;
            M1=1;
            CNTR=logN;                      {Initialize stage counter}
            L0=0;
            L1=0;
            L2=0;
            DO stage_loop UNTIL CE;
               I0=^inplacedata;             {I0 -> x0}
               I1=^inplacedata;
               AY0=DM(node_space);
               M2=AY0;
               MODIFY(I1,M2);               {I1 -> x1}
               I2=I1;
               AX0=2;
               AR=AY0-AX0;
               M3=AR;                       {M3=node_space-2}
               CNTR=DM(groups);             {Initialize group counter}
               SB=-2;
               M5=CNTR;                     {Init. twiddle factor modifier}
               DO group_loop UNTIL CE;
                  CNTR=DM(bflys_per_group); {Init. butterfly counter}
                  AX0=DM(I0,M1);            {AX0=x0}
                  AY0=DM(I1,M1);            {AY0=x1}
                  AY1=DM(I1,M1);            {AY1=y1}
                  DO bfly_loop UNTIL CE;
                     AR=AX0+AY0, AX1=DM(I0,M0), MY0=PM(I4,M5);
                                    {AR=x0+x1, AX1=y0, MY0=C}
                     SB=EXPADJ AR;  {Check for bit growth}
                     DM(I0,M1)=AR, AR=AX1+AY1;
                                    {x0'=x0+x1, AR=y0+y1}
                     SB=EXPADJ AR;  {Check for bit growth}
                     DM(I0,M1)=AR, AR=AX0-AY0;
                                    {y0'=y0+y1, AR=x0-x1}
                     MX0=AR, AR=AX1-AY1;
                                    {MX0=x0-x1, AR=y0-y1}
                     MR=MX0*MY0(SS), AX0=DM(I0,M1), MY1=PM(I5,M5);
                                    {MR=(x0-x1)C, AX0=next x0, MY1=(-S)}
                     MR=MR-AR*MY1(RND), AY0=DM(I1,M1);
                                    {MR=(x0-x1)C-(y0-y1)(-S), AY0=next x1}
                     SB=EXPADJ MR1; {Check for bit growth}
                     DM(I2,M1)=MR1, MR=AR*MY0(SS);
                                    {x1'=(x0-x1)C-(y0-y1)(-S), MR=(y0-y1)C}
                     MR=MR+MX0*MY1(RND), AY1=DM(I1,M1);
                                    {MR=(y0-y1)C+(x0-x1)(-S), AY1=next y1}
bfly_loop:           DM(I2,M1)=MR1, SB=EXPADJ MR1;
                                    {y1'=(y0-y1)C+(x0-x1)(-S), check bit growth}
                  MODIFY(I2,M2);    {I2->x1 of first butterfly in next group}
                  MODIFY(I1,M3);    {I1->x1 of first butterfly in next group}
                  MODIFY(I0,M3);
group_loop:       MODIFY(I0,M1);    {I0->x0 of first butterfly in next group}
               CALL bfp_adjust;             {Adjust block data for bit growth}
               SI=DM(groups);
               SR=LSHIFT SI BY 1 (LO);
               DM(groups)=SR0;              {groups=groups x 2}
               SI=DM(node_space);
               SR=LSHIFT SI BY -1 (LO);
               DM(node_space)=SR0;          {node_space=node_space / 2}
               SR=LSHIFT SR0 BY -1 (LO);
stage_loop:    DM(bflys_per_group)=SR0;     {bflys_per_group=bflys_per_group / 2}
            RTS;

.ENDMOD;

Listing 8-14: Radix-2 DIF FFT Routine, Conditional Block Floating-Point 

Bit Reversal 

As described earlier, the output sequence of the radix-2 DIF FFT is in bit-reversed order and 
must be unscrambled after the FFT is performed. This unscrambling is also accomplished 
through bit reversal. The basic scheme and code of the bit reversal algorithm are described in 
Section 8.3. 

Listing 8-15 shows the unscramble routine, which places the output data of the DIF FFT 
in sequential order. The module begins by initializing two constants. The first constant (N) is 
the number of input points in the FFT. The second constant (mod_value) is the modify value 
for the pointer which outputs the bit-reversed addresses. Pointers to the data buffers are 
initialized, and the bit-reverser is enabled for DAG1. In bit-reverse mode, any addresses output 
from registers I0, I1, I2, or I3 will be bit-reversed. The I1 register is used in unscramble. 

The unscramble routine uses two loops: one to unscramble the real FFT output data, the 
other to unscramble the imaginary output data. I4 points to the first of the scrambled real data 
values in the inplacedata buffer. I4 is modified by two (in M4) after each read. Because the real 
and imaginary data in inplacedata are interleaved, this ensures that only real data is read for the 
first loop. I1 contains the (bit-reversed) address of the first location in the real_results buffer 
(for unscrambled real data). The appropriate modify value (stored in M1) is added to I1 upon 
each data memory write. Before entering the second loop, I4 is updated to point to the first 
imaginary data in inplacedata and I1 is set to the first address (bit-reversed) of the imaginary_results 
buffer (for sequentially ordered imaginary data). 
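
The address sequence produced in bit-reverse mode can be modeled in Python. This sketch assumes that DAG1 reverses all 14 address bits, and it infers the buffer base addresses (0x0800 for real_results, 0x0C00 for imaginary_results) from the I1 values 0x4 and 0xC used in Listing 8-15; the function names are hypothetical.

```python
def bit_reverse_14(address):
    """Reverse the 14 address bits, as assumed for DAG1 bit-reverse mode."""
    r = 0
    for _ in range(14):
        r = (r << 1) | (address & 1)
        address >>= 1
    return r


def unscramble_addresses(i1, m1, count):
    """Addresses written by `count` bit-reversed writes DM(I1,M1)=..."""
    addresses = []
    for _ in range(count):
        addresses.append(bit_reverse_14(i1))   # address output is bit-reversed
        i1 = (i1 + m1) & 0x3FFF                # I1 itself is modified normally
    return addresses


real_addresses = unscramble_addresses(0x4, 0x0010, 1024)
```

With a modify value of 0x0010 (2 to the power 14-10), the k-th write lands at the base address plus the 10-bit reversal of k, so the 1024 writes fill the output buffer exactly once.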



{ Bit-Reverse (Unscramble) Subroutine                       DIF_BREV.DSP

  Calling Parameters
      Real and imaginary scrambled output data in inplacedata

  Return Values
      Sequentially ordered real output data in real_results
      Sequentially ordered imag. output data in imaginary_results

  Altered Registers
      I0, I1, I4, M1, M4, AY1, CNTR

  Altered Memory
      real_results, imaginary_results
}

.MODULE     dif_unscramble;
.CONST      N=1024, mod_value=0x0010;       {Initialize constants}
.EXTERNAL   inplacedata;
.ENTRY      unscramble;                     {Declare entry point into module}

unscramble:   I4=^inplacedata;              {I4 -> real part of 1st data point}
              M4=2;                         {Modify by 2 to fetch only real data}
              L0=0;
              L4=0;
              I1=0x4;                       {I1=1st real output addr, bit-reversed}
              M1=mod_value;                 {Modifier for 10-bit reversal}
              CNTR=N;                       {N=number of real data points}
              ENA BIT_REV;                  {Enable bit-reverse}
              DO bit_rev_real UNTIL CE;
                 AY1=DM(I4,M4);             {Read real data}
bit_rev_real:    DM(I1,M1)=AY1;             {Place in sequential order}
              I4=^inplacedata+1;            {I4 -> imag. part of 1st data point}
              I1=0xC;                       {I1=1st imag. output addr, bit-reversed}
              CNTR=N;                       {N=number of imaginary data points}
              DO bit_rev_imag UNTIL CE;
                 AY1=DM(I4,M4);             {Read imag. data}
bit_rev_imag:    DM(I1,M1)=AY1;             {Place in sequential order}
              DIS BIT_REV;                  {Disable bit-reverse}
              RTS;

.ENDMOD;

Listing 8-15: Bit-Reverse (Unscramble) Routine 



Block Floating-Point Scaling 

In Section 8.3, we discussed the block floating-point scaling algorithm used in the DIT FFT 
subroutine. The scaling is used to maximize the dynamic range of a fixed-point data field. This 
block floating-point routine, dit_radix-2_bfp_adjust, can be modified for the DIF FFT routine 
by replacing inplacereal references with inplacedata. The modified routine, 
dif_radix-2_bfp_adjust, is shown in Listing 8-16. It performs block floating-point scaling on the 
outputs of each stage except the last of the DIF FFT. 



{ DIF Radix-2 Block Floating-Point Scaling Routine          DIF_BFPA.DSP

  Calling Parameters
      Radix-2 DIF FFT stage results in inplacedata

  Return Parameters
      inplacedata adjusted for bit growth

  Altered Registers
      I0, I1, AX0, AY0, AR, MX0, MY0, MR, CNTR

  Altered Memory
      inplacedata, blk_exponent
}

.MODULE     dif_radix_2_bfp_adjust;
.CONST      Ntimes2 = 2048;
.EXTERNAL   inplacedata, blk_exponent;        {Begin declaration section}
.ENTRY      bfp_adjust;

bfp_adjust: AY0=CNTR;                         {Check for last stage}
            AR=AY0-1;
            IF EQ RTS;                        {If last stage, return}
            AY0=-2;
            AX0=SB;
            AR=AX0-AY0;                       {Check for SB=-2}
            IF EQ RTS;                        {If SB=-2, no bit growth, return}
            I0=^inplacedata;                  {I0=read pointer}
            I1=^inplacedata;                  {I1=write pointer}
            AY0=-1;
            MY0=0x4000;                       {Set MY0 to shift 1 bit right}
            AR=AX0-AY0,MX0=DM(I0,M1);         {Check if SB=-1; Get first sample}
            IF EQ JUMP strt_shift;            {If SB=-1, shift block data 1 bit}
            AX0=-2;                           {Set AX0 for block exponent update}
            MY0=0x2000;                       {Set MY0 to shift 2 bits right}
strt_shift: CNTR=Ntimes2 - 1;                 {Initialize loop counter}
            DO shift_loop UNTIL CE;           {Shift block of data}
               MR=MX0*MY0(RND),MX0=DM(I0,M1); {MR=shifted data, MX0=next value}
shift_loop:    DM(I1,M1)=MR1;                 {Unshifted data=shifted data}
            MR=MX0*MY0(RND);                  {Shift last data word}
            AY0=DM(blk_exponent);             {Update block exponent and}
            DM(I1,M1)=MR1,AR=AY0-AX0;         {store last shifted sample}
            DM(blk_exponent)=AR;
            RTS;

.ENDMOD;

Listing 8-16: DIF Radix-2 Block Floating-Point Scaling Routine 



8.4.3 Exercises 



1. The DIF FFT modules described in this section perform a 1024-point DFT. 
Implementing a different size FFT requires changing the values of the constants N, 
N_x_2, N_div_2, and logN in Listing 8-10. Similarly, a new set of twiddle factors 
twid_real and twid_imag must be computed and stored in data files. Modify the 
DIF FFT main program (and its related modules) to compute an 8-point DFT. 
Determine the proper twiddle factors and store their real and imaginary parts in 
data files. Create a data file containing 8 samples of the signal 

$$x(n) = \cos\frac{\pi n}{4}, \qquad n = 0, 1, \ldots, 7$$

Use the Simulator to compute and verify its DFT. 

2. A radix-2 FFT divides an N-point sequence successively in half until only two-point 
DFTs remain. Similarly, a radix-4 FFT divides an N-point sequence successively in 
quarters until only four-point DFTs remain. The four-point DFT is the core 
calculation (butterfly) of the radix-4 FFT. Refer to Proakis and Manolakis [8] for more 
details and the algorithm. Write a main program, a DIF FFT module rad4_fft, and 
a digit reversal module digit_rev to implement a 16-point DFT. A solution is available 
on the diskette. 



8.5 THE INVERSE DFT AND THE IFFT ALGORITHM 



The inverse relationship for obtaining a sequence from its DFT is called the inverse DFT (IDFT). 
It is given by the equation 

$$x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\,W_N^{-nk}, \qquad n = 0, \ldots, N-1 \tag{8-24}$$

Although the FFT algorithms described in the last two sections were presented in the context 
of computing the DFT efficiently, they may also be used in computing the IDFT. 

The only difference between the two transforms is the normalization factor 1/N and the 
phase sign of the twiddle factor $W_N$. Consequently, an FFT algorithm for computing the DFT 
may be converted into an IFFT algorithm for computing the IDFT by using a reversed (upside 
down) twiddle factor table and by dividing the output of the algorithm by N. 
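
The conversion from DFT to IDFT can be illustrated with a short Python sketch. It uses a direct DFT rather than an FFT to keep the example small, and it interprets the "reversed" table as reading the same twiddle factors backward, since $W_N^{-k} = W_N^{N-k}$; the function names are hypothetical.

```python
import cmath


def make_twiddles(N):
    """Twiddle factor table W_N^k = exp(-j*2*pi*k/N), k = 0..N-1."""
    return [cmath.exp(-2j * cmath.pi * k / N) for k in range(N)]


def dft(x, table):
    """Direct DFT that takes all of its twiddle factors from `table`."""
    N = len(x)
    return [sum(x[n] * table[(n * k) % N] for n in range(N)) for k in range(N)]


def idft(X, table):
    """Same algorithm with the table read in reverse, then scaled by 1/N."""
    N = len(X)
    reversed_table = [table[(-k) % N] for k in range(N)]   # W_N^(-k) = W_N^(N-k)
    return [value / N for value in dft(X, reversed_table)]


table = make_twiddles(8)
x = [1, 2, 3, 4, 0, -1, -2, -3]
recovered = idft(dft(x, table), table)
```

The round trip returns the original sequence to within rounding error. On a fixed-point processor the division by N would normally be folded into the scaling already performed between stages.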

8.6 SUMMARY 



The implementation of the DFT on the ADSP-2101 using fast algorithms was the topic of this 
chapter. In particular, we developed the decimation-in-time and decimation-in-frequency fast 
Fourier transform algorithms and described detailed procedures to implement these algorithms 
in assembly language. These by no means are the only efficient algorithms available to compute 
the DFT. Several other excellent algorithms are available in the literature. One such algorithm 
is the Radix-4 FFT which is widely used in practice. The discussion provided in this chapter 
should help the user develop programs for this and other algorithms. Furthermore, the programs 
given in this chapter can be used as building blocks for developing programs in spectrum 
analysis, correlation analysis, and frequency domain linear filtering. 




The remaining two chapters describe applications of digital signal processing in digital 
communications and adaptive filtering using programs and routines developed so far. 






chapter 9 



APPLICATIONS IN COMMUNICATIONS 



9.1 INTRODUCTION 

Today, microprocessors and microcomputers find widespread use in the implementation of a 
variety of electronic systems. In this chapter we shall focus on several applications dealing 
with waveform representation and coding, especially speech coding, and with digital commu- 
nications. In particular, we shall describe several methods for digitizing analog waveforms, 
with specific application to speech coding and transmission. These methods are pulse-code 
modulation (PCM), differential PCM and adaptive differential PCM (ADPCM), delta modulation 
(DM) and adaptive delta modulation (ADM), and linear predictive coding (LPC). An experiment 
is formulated involving each of these waveform encoding methods for implementation on the 
ADSP-2101 family of microcomputers. 

The last three topics treated in this chapter deal with signal detection applications that are 
usually encountered in the implementation of a receiver in a digital communication system. For 
each of these topics we describe an experiment that involves the implementation of the detection 
scheme on an ADSP-2101 microcomputer. 

9.2 PULSE CODE MODULATION 

Pulse code modulation is a method for quantizing an analog signal for the purpose of transmitting 
or storing the signal in digital form. PCM is widely used for speech transmission in telephone 
communications and for telemetry systems that employ radio transmission. We shall concentrate 
our attention on the application of PCM to speech signal processing. 

Speech signals transmitted over telephone channels are usually limited in bandwidth to 
the frequency range below 4kHz. Hence, the Nyquist rate for sampling such a signal is less 
than 8kHz. In PCM, the analog speech signal is sampled at the nominal rate of 8kHz (samples 
per second) and each sample is quantized to one of $2^b$ levels, and represented digitally by a 
sequence of b bits. Thus, the bit rate required to transmit the digitized speech signal is 8000b 
bits per second. 
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The quantization process may be modeled mathematically as 

s(n) = s(n) + q(n) 



(9-1) 



where s(n) represents the quantized value of s(n) and q{n) represents the quantization error 
which we treat as an additive noise. Assuming that a uniform quantizer is used and the number 
of levels is sufficiently large, the quantization noise is well characterized statistically by the 
uniform probability density function, 



$$p(q) = \frac{1}{\Delta}, \qquad -\frac{\Delta}{2} \le q \le \frac{\Delta}{2} \tag{9-2}$$

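where the step size of the quantizer is $\Delta = 2^{-b}$. The mean square value of the quantization error
is

$$E(q^2) = \frac{\Delta^2}{12} = \frac{2^{-2b}}{12} \tag{9-3}$$

Measured in decibels, the mean square value of the noise is

$$10 \log_{10} \frac{\Delta^2}{12} = 10 \log_{10} \left( \frac{2^{-2b}}{12} \right) = -6b - 10.8 \text{ dB} \tag{9-4}$$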

We observe that the quantization noise decreases by 6 dB for each bit used in the quantizer. High
quality speech requires a minimum of 12 bits per sample and, hence, a bit rate of 96,000 bits 
per second (bps). 
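
As a quick numerical check of (9-4) and of the bit rates quoted above, the following minimal sketch evaluates the noise level and the PCM bit rate for a given number of bits. Python is used here purely for illustration (the experiments themselves run on the ADSP-2101), and the function names are hypothetical.

```python
import math

def quantization_noise_db(b):
    """Mean square quantization error in dB for step size 2**(-b), per (9-3) and (9-4)."""
    step = 2.0 ** (-b)
    return 10.0 * math.log10(step ** 2 / 12.0)

def pcm_bit_rate(b, sampling_rate_hz=8000):
    """Bit rate in bits per second when each sample is coded with b bits."""
    return sampling_rate_hz * b

if __name__ == "__main__":
    for b in (8, 12):
        print(f"b = {b:2d}: noise = {quantization_noise_db(b):6.1f} dB, "
              f"rate = {pcm_bit_rate(b)} bps")
```

Each additional bit lowers the computed noise level by about 6.02 dB, which is the slope that appears in (9-4).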

Speech signals have the characteristic that small signal amplitudes occur more frequently 
than large signal amplitudes. However, a uniform quantizer provides the same spacing between 
successive levels throughout the entire dynamic range of the signal. A better approach is to use 
a nonuniform quantizer which provides more closely spaced levels at the low signal amplitudes 
and more widely spaced levels at the large signal amplitudes. For a nonuniform quantizer with 
b bits, the resulting quantization error has a mean square value that is smaller than that given 
by (9-4). A nonuniform quantizer characteristic is usually obtained by passing the signal through 
a nonlinear device that compresses the signal amplitude, followed by a uniform quantizer. For 
example, a logarithmic compressor employed in U.S. and Canadian telecommunications systems 
has an input-output magnitude characteristic of the form 



$$|y| = \frac{\log(1 + \mu |s|)}{\log(1 + \mu)} \tag{9-5}$$



where |s| is the magnitude of the input, |y| is the magnitude of the output, and μ is a parameter
that is selected to give the desired compression characteristic.

In the encoding of speech waveforms the value of μ = 255 has been adopted as a standard
in the U.S. and Canada. This value results in about a 24 dB reduction in the quantization noise
power relative to uniform quantization. Consequently, an 8-bit quantizer used in conjunction
with a μ = 255 logarithmic compressor produces the same quality speech as a 12-bit uniform
quantizer with no compression. Thus, the compressed PCM speech signal has a bit rate of
64,000 bps.
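
The effect of the logarithmic characteristic (9-5) on a small-amplitude sample can be illustrated with a short sketch. It assumes samples normalized to the range [-1, 1], uses the algebraic inverse of (9-5) for expansion, and models the 8-bit quantizer as simple rounding to a uniform grid; the function names are hypothetical and Python is used only for illustration.

```python
import math

MU = 255.0

def mu_compress(s, mu=MU):
    """Apply (9-5) to a sample s in [-1, 1], preserving its sign."""
    return math.copysign(math.log1p(mu * abs(s)) / math.log1p(mu), s)

def mu_expand(y, mu=MU):
    """Algebraic inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def uniform_quantize(v, bits):
    """Round v in [-1, 1] to the nearest multiple of the step 2 / 2**bits."""
    step = 2.0 / (2 ** bits)
    return step * round(v / step)

if __name__ == "__main__":
    s = 0.01  # a small-amplitude sample
    direct = uniform_quantize(s, 8)
    companded = mu_expand(uniform_quantize(mu_compress(s), 8))
    print("uniform 8-bit error:  ", abs(direct - s))
    print("companded 8-bit error:", abs(companded - s))
```

For small amplitudes the companded error is noticeably smaller than the error of the uniform quantizer with the same number of bits, at the cost of a larger error for samples near full scale.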

The logarithmic compressor standard used in European telecommunication systems is 
called A-law and is defined as 



$$|y| = \frac{1 + \log(A|s|)}{1 + \log A} \tag{9-6}$$



where A is chosen as 87.56. Although (9-5) and (9-6) are different nonlinear functions, the two
compression characteristics are very similar, as Figure 9-1 illustrates.

In the reconstruction of the signal from the quantized values, the decoder employs an
inverse logarithmic relation to expand the signal amplitude. The combined compressor-expandor
pair is termed a compandor.

In an implementation of the logarithmic compressor, the logarithmic function is
approximated by a piecewise linear function. Eight straight line segments along the curve result
in a close approximation to the logarithmic function. A sample value of the signal is represented
by its sign (positive or negative), the straight-line segment on the logarithmic approximation
(three bits to specify one of the eight segments), and the position of the sample on the particular
line segment. Thus, the μ-law PCM 8-bit word consists of the following three parts: (1) the
most significant bit (MSB) is the sign bit; (2) the next three bits represent the straight line
segment number; (3) the last four bits represent the position within the segment.
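
The three-part word described above can be sketched as follows. This is one common piecewise-linear arrangement, in the style of the ITU-T G.711 μ-law tables, and is an assumption rather than the book's own module: the input is taken to be a signed 14-bit linear sample, a bias of 33 aligns the segment boundaries with powers of two, and the bit inversion that telephone codecs normally apply before transmission is omitted. The function names are hypothetical.

```python
BIAS = 33          # assumed offset that aligns segment edges with powers of two
MAX_BIASED = 8191  # largest 13-bit magnitude after the bias is added

def mulaw_encode(sample):
    """Pack a signed 14-bit linear sample into sign | segment | position."""
    sign = 0x80 if sample < 0 else 0x00
    mag = min(abs(sample) + BIAS, MAX_BIASED)
    segment = mag.bit_length() - 6              # 0..7, from the leading one
    position = (mag >> (segment + 1)) & 0x0F    # four bits below the leading one
    return sign | (segment << 4) | position

def mulaw_decode(code):
    """Return the linear value at the center of the coded interval."""
    sign = -1 if code & 0x80 else 1
    segment = (code >> 4) & 0x07
    position = code & 0x0F
    mag = (((position << 1) + BIAS) << segment) - BIAS
    return sign * mag

if __name__ == "__main__":
    for x in (0, 5, 100, 1000, -4000, 8158):
        code = mulaw_encode(x)
        print(x, format(code, "08b"), mulaw_decode(code))
```

The step size doubles from one segment to the next, so the reconstruction error grows with the signal amplitude, as the logarithmic characteristic intends.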

PCM Experiment 

The purpose of this experiment is to gain an understanding of the PCM compression
(linear-to-logarithmic) and the PCM expansion (logarithmic-to-linear) algorithms.

Use the two software modules, one for μ = 255 compression and one for μ = 255 expansion,
to study the effect of logarithmic companding. Create data files of different waveforms
and pass them through the μ-law compressor and expander. Display the input and output
waveforms as indicated in Figure 9-2, and comment on the results. Include an exponential
waveform and a saw-tooth waveform among the input signals. A waveform generator
may also be substituted for the waveform data file. If the microphone input is used for
the input port, modify your program to use only PCM expansion.

Figure 9-2: PCM Experiment

Note that since the compressed PCM code is an 8-bit word, it is shifted (left justified) 
and displayed. Also note that decoding uses full fractional numbers (1.15 format) rather 
than the integer format. 

9.3 DIFFERENTIAL PCM (DPCM) AND ADAPTIVE DPCM (ADPCM)

In PCM each sample of the waveform is encoded independently of all the other samples.
However, most signals, including speech, sampled at the Nyquist rate or faster exhibit significant
correlation between successive samples. In other words, the average change in amplitude
between successive samples is relatively small. Consequently an encoding scheme that exploits
the redundancy in the samples will result in a lower bit rate for the speech signal.

9.3.1 Differential PCM 

A relatively simple solution is to encode the differences between successive samples rather than 
the samples themselves. Since differences between samples are expected to be smaller than the 
actual sampled amplitudes, fewer bits are required to represent the differences. A refinement 
of this general approach is to predict the current sample based on the previous p samples. To 
be specific, let s(n) denote the current sample of speech and let $\hat{s}(n)$ denote the predicted value
of s(n) defined as

$$\hat{s}(n) = \sum_{i=1}^{p} a(i)\, s(n-i) \tag{9-7}$$

Thus $\hat{s}(n)$ is a weighted linear combination of the past p samples and the a(i) are the predictor
(filter) coefficients. The a(i) are selected to minimize some function of the error between s(n)
and $\hat{s}(n)$.

A mathematically and practically convenient error function is the sum of squared errors.
With this as the performance index for the predictor, we select the a(i) to minimize

$$\mathcal{E}_p = \sum_{n=1}^{N} e^2(n) = \sum_{n=1}^{N} \left[ s(n) - \sum_{i=1}^{p} a(i)\, s(n-i) \right]^2$$

$$\mathcal{E}_p = r_{ss}(0) - 2 \sum_{i=1}^{p} a(i)\, r_{ss}(i) + \sum_{i=1}^{p} \sum_{j=1}^{p} a(i)\, a(j)\, r_{ss}(i-j) \tag{9-8}$$

where $r_{ss}(m)$ is the autocorrelation function of the sampled signal sequence s(n) defined as

$$r_{ss}(m) = \sum_{i=1}^{N} s(i)\, s(i+m) \tag{9-9}$$

Minimization of $\mathcal{E}_p$ with respect to the predictor coefficients a(i) results in the set of linear
equations, called the normal equations,

$$\sum_{i=1}^{p} a(i)\, r_{ss}(i-j) = r_{ss}(j), \qquad j = 1, 2, \ldots, p \tag{9-10}$$

Thus the values of the predictor coefficients are established.

Having described the method for determining the predictor coefficients, let us now consider
the block diagram of a practical DPCM system, shown in Figure 9-3. In this configuration,
the predictor is implemented with the feedback loop around the quantizer. The input to the
predictor is denoted as $\tilde{s}(n)$, which represents the signal sample s(n) modified by the quantization
process, and the output of the predictor is

$$\hat{s}(n) = \sum_{i=1}^{p} a(i)\, \tilde{s}(n-i) \tag{9-11}$$

The difference

$$e(n) = s(n) - \hat{s}(n) \tag{9-12}$$

is the input to the quantizer and $\tilde{e}(n)$ denotes the output. Each value of the quantized prediction
error $\tilde{e}(n)$ is encoded into a sequence of binary digits and transmitted over the channel to the
receiver. The quantized error $\tilde{e}(n)$ is also added to the predicted value $\hat{s}(n)$ to yield $\tilde{s}(n)$.

Figure 9-3: (a) Block Diagram of a DPCM Encoder
Figure 9-3: (b) DPCM Decoder at the Receiver

At the receiver the same predictor that was used at the transmitting end is synthesized and
its output $\hat{s}(n)$ is added to $\tilde{e}(n)$ to yield $\tilde{s}(n)$. The signal $\tilde{s}(n)$ is the desired excitation for the
predictor and also the desired output sequence from which the reconstructed signal $\tilde{s}(t)$ is
obtained by filtering, as shown in Figure 9-3b.

The use of feedback around the quantizer, as described above, ensures that the error in
$\tilde{s}(n)$ is simply the quantization error $q(n) = \tilde{e}(n) - e(n)$ and that there is no accumulation of
previous quantization errors in the implementation of the decoder. That is,

$$q(n) = \tilde{e}(n) - e(n) = \tilde{e}(n) - [s(n) - \hat{s}(n)] = \tilde{s}(n) - s(n) \tag{9-13}$$

Hence $\tilde{s}(n) = s(n) + q(n)$. This means that the quantized sample $\tilde{s}(n)$ differs from the input
s(n) by the quantization error q(n) independent of the predictor used. Therefore the quantization
errors do not accumulate.
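
The absence of error accumulation can be checked with a minimal DPCM sketch. It assumes a first-order predictor with a single hypothetical coefficient a, a uniform quantizer without clipping, and it passes the quantized error values themselves rather than binary code words; all names are illustrative and Python is used only as a convenient notation.

```python
import math

def quantize(e, step):
    """Uniform quantizer without clipping (a simplification)."""
    return step * round(e / step)

def dpcm_encode(samples, a=0.9, step=0.05):
    """Return the quantized errors and the encoder's reconstructed samples."""
    codes, recon, prev = [], [], 0.0
    for s in samples:
        s_hat = a * prev                   # (9-11) with p = 1
        e_tilde = quantize(s - s_hat, step)  # quantized version of (9-12)
        prev = s_hat + e_tilde             # reconstructed sample fed back
        codes.append(e_tilde)
        recon.append(prev)
    return codes, recon

def dpcm_decode(codes, a=0.9):
    """Rebuild the samples from the quantized errors alone."""
    out, prev = [], 0.0
    for e_tilde in codes:
        prev = a * prev + e_tilde
        out.append(prev)
    return out

signal = [math.sin(2.0 * math.pi * n / 40.0) for n in range(200)]

if __name__ == "__main__":
    codes, recon = dpcm_encode(signal)
    decoded = dpcm_decode(codes)
    print("max reconstruction error:",
          max(abs(d - s) for d, s in zip(decoded, signal)))
```

Because the predictor in the encoder operates on the reconstructed samples, the decoder output stays within half a quantizer step of the input for the whole sequence, in agreement with (9-13).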

In the DPCM system illustrated in Figure 9-3, the estimate or predicted value $\hat{s}(n)$ of the
signal sample is obtained by taking a linear combination of past values
$\tilde{s}(n-k)$, k = 1, 2, ..., p, as indicated by (9-11). An improvement in the quality of the estimate
is obtained by including linearly filtered past values of the quantized error. Specifically, the
estimate $\hat{s}(n)$ may be expressed as

$$\hat{s}(n) = \sum_{i=1}^{p} a(i)\, \tilde{s}(n-i) + \sum_{i=1}^{m} b(i)\, \tilde{e}(n-i) \tag{9-14}$$

where b(i) are the coefficients of the filter for the quantized error sequence $\tilde{e}(n)$. The block
diagrams of the encoder at the transmitter and the decoder at the receiver are shown in Figure
9-4. The two sets of coefficients a(i) and b(i) are selected to minimize some function of the
error $e(n) = s(n) - \hat{s}(n)$, such as the sum of squared errors.

By using a logarithmic compressor and a 4-bit quantizer for the error sequence e(n),
DPCM results in high quality speech at a rate of 32,000 bps, which is a factor of two lower than
PCM.

9.3.2 Adaptive PCM and DPCM 

In general, the power in a speech signal varies slowly with time. PCM and DPCM encoders,
however, are designed on the basis that the speech signal power is constant and, hence, the
quantizer is fixed. The efficiency and performance of these encoders can be improved by having
them adapt to the slowly time-variant power level of the speech signal.

In both PCM and DPCM, the quantization error q(n) resulting from a uniform quantizer
operating on a slowly varying power level input signal will have a time-variant variance
(quantization noise power). One improvement which reduces the dynamic range of the quantization
noise is the use of an adaptive quantizer.

Adaptive quantizers can be classified as feedforward or feedback. A feedforward adaptive
quantizer adjusts its step size for each signal sample based on a measurement of the input speech
signal variance (power). For example, the estimated variance based on a sliding window estimator
is

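$$\hat{\sigma}_{n+1}^2 = \frac{1}{M} \sum_{k=n+1-M}^{n} s^2(k) \tag{9-15}$$

where M is the length of the sliding window. Then, the step size for the quantizer is

$$\Delta(n+1) = \Delta(n)\,\hat{\sigma}_{n+1} \tag{9-16}$$

In this case, it is necessary to transmit Δ(n + 1) to the decoder in order for it to reconstruct the
signal.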

A feedback adaptive quantizer employs the output of the quantizer in the adjustment of
the step size. In particular, we may set the step size as

$$\Delta(n+1) = \alpha(n)\,\Delta(n) \tag{9-17}$$

where the scale factor α(n) depends on the previous quantizer output. For example, if the
previous quantizer output is small, we may select α(n) < 1 in order to provide for finer quantization.
On the other hand, if the quantizer output is large, then the step size should be increased
to reduce the possibility of signal clipping. Such an algorithm has been successfully used in
the encoding of speech signals. Figure 9-5 illustrates such a (3-bit) quantizer in which the step
size is adjusted recursively according to the relation

$$\Delta(n+1) = \Delta(n) \cdot M(n)$$

where M(n) is a multiplication factor whose value depends on the quantizer level for the sample
s(n) and Δ(n) is the step size of the quantizer for processing s(n). Values of the multiplication
factors optimized for speech encoding have been given by Jayant [18]. These values are displayed
in Table 9-1 for 2-, 3-, and 4-bit quantization, for PCM and DPCM.

Table 9-1: Multiplication Factors for Adaptive Step Size Adjustment (Jayant [18])



In DPCM, the predictor can also be made adaptive. Thus, in ADPCM the coefficients of
the predictor are changed periodically to reflect the changing signal statistics of the speech. The
linear equations given by (9-10) still apply, but the short-term autocorrelation function of s(n),
$r_{ss}(m)$, changes with time.
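
The feedback step-size rule Δ(n+1) = Δ(n)·M(n) can be sketched for a 2-bit quantizer as follows. The multiplier values (0.8 for the inner level, 1.6 for the outer level) are illustrative placeholders and should be replaced by the entries of Table 9-1; the limits on the step size and all function names are likewise assumptions made for this sketch.

```python
def clamp(value, low, high):
    return min(max(value, low), high)

def adaptive_quantize(samples, multipliers=(0.8, 1.6), step0=0.1,
                      step_min=1e-3, step_max=1.0):
    """Return signed level indices and the step size used for each sample."""
    top = len(multipliers) - 1
    step, levels, steps = step0, [], []
    for s in samples:
        mag = min(int(abs(s) / step), top)        # magnitude level 0..top
        levels.append(mag if s >= 0 else -(mag + 1))
        steps.append(step)
        step = clamp(step * multipliers[mag], step_min, step_max)
    return levels, steps

def adaptive_dequantize(levels, multipliers=(0.8, 1.6), step0=0.1,
                        step_min=1e-3, step_max=1.0):
    """Track the step size from the received levels only (no side information)."""
    step, out = step0, []
    for level in levels:
        mag = level if level >= 0 else -(level + 1)
        sign = 1.0 if level >= 0 else -1.0
        out.append(sign * (mag + 0.5) * step)     # midpoint of the level
        step = clamp(step * multipliers[mag], step_min, step_max)
    return out

if __name__ == "__main__":
    levels, steps = adaptive_quantize([0.01] * 10 + [0.9] * 10)
    print([round(v, 4) for v in steps])
```

The step size shrinks while the input stays small and grows once the outer level is selected repeatedly. Since the update depends only on the transmitted levels, the decoder can reproduce the same step sizes without receiving them.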

9.3.3 ADPCM Standard

Figure 9-6 illustrates, in block diagram form, a 32,000 bps ADPCM encoder and decoder that
has been adopted as an international (CCITT) standard for speech transmission over telephone
channels. The ADPCM encoder is designed to accept 8-bit PCM compressed signal samples
at 64,000 bps and by means of adaptive prediction and adaptive 4-bit quantization reduces the
bit rate over the channel to 32,000 bps. The ADPCM decoder accepts the 32,000 bps data stream
and reconstructs the signal in the form of an 8-bit compressed PCM at 64,000 bps. Thus, we
have a configuration shown in Figure 9-7, where the ADPCM encoder/decoder is embedded
into a PCM system. Although the ADPCM encoder/decoder could be used directly on the speech
signal, the interface to the PCM system is necessary in practice in order to maintain compatibility
with existing PCM systems that are widely used in the telephone network.

Figure 9-7: ADPCM Interface to PCM System

The ADPCM encoder accepts the 8-bit PCM compressed signal and expands it to a 14-bit
per sample linear representation for processing. The predicted value is subtracted from this
14-bit linear value to produce a difference signal sample that is fed to the quantizer. Adaptive
quantization is performed on the difference signal to produce a 4-bit output for transmission
over the channel.

Both the encoder and decoder update their internal variables based only on the ADPCM 
values that are generated. Consequently, an ADPCM decoder including an inverse adaptive 
quantizer is embedded in the encoder so that all internal variables are updated based on the same 
data. This ensures that the encoder and decoder operate in synchronism without the need to
transmit any information on the values of internal variables.

The adaptive predictor computes a weighted average of the last six dequantized difference
values and the last two predicted values. Hence, this predictor is basically a two-pole (p = 2)
and six-zero (m = 6) filter governed by the difference equation given by (9-14). The filter
coefficients are updated adaptively for every new input sample.

At the receiving decoder and at the decoder that is embedded in the encoder, the 4-bit
transmitted ADPCM value is used to update the inverse adaptive quantizer whose output is a
dequantized version of the difference signal. This dequantized value is added to the value
generated by the adaptive predictor to produce the reconstructed speech sample. This signal is
the output of the decoder which is converted to compressed PCM format at the receiver. 

ADPCM Experiment 

The objective of this experiment is to gain familiarity and understanding of ADPCM and
its interface with a PCM encoder/decoder (transcoder). As described above, the ADPCM
transcoder is inserted between the PCM compressor and the PCM expander, as shown in
Figure 9-7. This is the configuration of the software PCM and DPCM modules for this
experiment.

The input to the PCM-ADPCM transcoder system can be supplied either from internally
generated waveform data files, or from a microphone input, or from an external waveform
generator, just as in the case of the PCM experiment. The output of the transcoder can
be monitored and displayed from one of the DAC output ports. The output signal from
the PCM-ADPCM transcoder should be compared with the signal from
the PCM transcoder (PCM Experiment), and with the original input signal.



9.4 DELTA MODULATION (DM) 

Delta modulation may be viewed as a simplified form of DPCM in which a two-level (1-bit)
quantizer is used in conjunction with a fixed first-order predictor. The block diagram of a DM
encoder-decoder is shown in Figure 9-8a. We note that

$$\hat{s}(n) = \tilde{s}(n-1) = \hat{s}(n-1) + \tilde{e}(n-1) \tag{9-18}$$

Since

$$q(n) = \tilde{e}(n) - e(n) = \tilde{e}(n) - [s(n) - \hat{s}(n)]$$

it follows that

$$\hat{s}(n) = s(n-1) + q(n-1) \tag{9-19}$$

Thus the estimated (predicted) value of s(n) is really the previous sample s(n - 1) modified by
the quantization noise q(n - 1). We also note that the difference equation in (9-18) represents
an integrator with an input $\tilde{e}(n)$. Hence an equivalent realization of the one-step predictor is an
accumulator with an input equal to the quantized error signal $\tilde{e}(n)$. In general, the quantized
error signal is scaled by some value, say $\Delta_1$, which is called the step size. This equivalent
realization is illustrated in Figure 9-8b. In effect, the encoder shown in Figure 9-8b approximates
a waveform s(t) by a linear staircase function. In order for the approximation to be relatively
good, the waveform s(t) must change slowly relative to the sampling rate. This requirement
implies that the sampling rate must be several (a factor of at least 5) times the Nyquist rate. A
lowpass filter is usually incorporated into the decoder to smooth the reconstructed signal.

Figure 9-8: (a) Block Diagram of a Delta Modulation System
Figure 9-8: (b) An Equivalent Realization of a Delta Modulation System

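
The accumulator form of the delta modulator can be sketched in a few lines. The sketch assumes a fixed step size, a zero initial estimate, and an input that is heavily oversampled; the lowpass filter at the decoder is left out, and the function names are hypothetical.

```python
import math

def dm_encode(samples, step):
    """Return the sequence of +1/-1 decisions of a 1-bit quantizer."""
    bits, s_hat = [], 0.0
    for s in samples:
        bit = 1 if s >= s_hat else -1
        bits.append(bit)
        s_hat += step * bit          # accumulator: integrate the scaled decision
    return bits

def dm_decode(bits, step):
    """Rebuild the staircase approximation from the decisions."""
    out, acc = [], 0.0
    for bit in bits:
        acc += step * bit
        out.append(acc)
    return out

# A slowly varying input: its largest change per sample (about 0.031)
# is smaller than the step size used below, so no slope overload occurs.
signal = [math.sin(2.0 * math.pi * n / 200.0) for n in range(400)]

if __name__ == "__main__":
    bits = dm_encode(signal, step=0.05)
    staircase = dm_decode(bits, step=0.05)
    print("max tracking error:",
          max(abs(a - b) for a, b in zip(staircase, signal)))
```

Repeating the run with a much smaller step, for example 0.005, shows the staircase falling behind the steep portions of the sinusoid.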
9.4.1 Adaptive Delta Modulation (ADM) 

At any given sampling rate, the performance of the DM encoder is limited by two types of
distortion. One is called slope-overload distortion. It is due to the use of a step size $\Delta_1$ that is
too small to follow portions of the waveform that have a steep slope. The second type of
distortion, called granular noise, results from using a step size that is too large in parts of the
waveform having a small slope. The need to minimize both of these types of distortion
results in conflicting requirements in the selection of the step size $\Delta_1$.

An alternative solution is to employ a variable step size that adapts itself to the short-term
characteristics of the source signal. That is, the step size is increased when the waveform has
a steep slope and decreased when the waveform has a relatively small slope.

A variety of methods can be used to set the step size adaptively in every iteration. The
quantized error sequence $\tilde{e}(n)$ provides a good indication of the slope characteristics of the
waveform being encoded. When the quantized error $\tilde{e}(n)$ is changing signs between successive
iterations, this is an indication that the slope of the waveform in the locality is relatively small.
On the other hand, when the waveform has a steep slope, successive values of the error $\tilde{e}(n)$
are expected to have identical signs. From these observations it is possible to devise algorithms
which decrease or increase the step size depending on successive values of $\tilde{e}(n)$. A relatively
simple rule devised by Jayant [17] is to vary the step size adaptively according to the relation

$$\Delta(n) = \Delta(n-1)\, K^{\tilde{e}(n)\,\tilde{e}(n-1)}, \qquad n = 1, 2, \ldots \tag{9-20}$$

where K > 1 is a constant that is selected to minimize the total distortion. A block diagram of
a DM encoder-decoder that incorporates this adaptive algorithm is illustrated in Figure 9-9.
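
The rule (9-20) can be sketched as a small change to a fixed-step delta modulator. The value K = 1.5, the initial step, and the lower and upper limits placed on the step size are assumptions made for this sketch (the limits are a safeguard that is not part of (9-20)); the function names are hypothetical.

```python
def _next_step(step, bit, prev_bit, K, step_min, step_max):
    """Apply (9-20), then keep the step inside assumed limits."""
    if prev_bit != 0:                      # no previous decision at n = 0
        step *= K ** (bit * prev_bit)      # multiply by K or by 1/K
    return min(max(step, step_min), step_max)

def adm_encode(samples, step0=0.01, K=1.5, step_min=1e-4, step_max=1.0):
    """Return the +1/-1 decisions of the adaptive delta modulator."""
    bits, s_hat, step, prev = [], 0.0, step0, 0
    for s in samples:
        bit = 1 if s >= s_hat else -1
        step = _next_step(step, bit, prev, K, step_min, step_max)
        s_hat += step * bit
        bits.append(bit)
        prev = bit
    return bits

def adm_decode(bits, step0=0.01, K=1.5, step_min=1e-4, step_max=1.0):
    """Repeat the same step-size recursion using only the received decisions."""
    out, acc, step, prev = [], 0.0, step0, 0
    for bit in bits:
        step = _next_step(step, bit, prev, K, step_min, step_max)
        acc += step * bit
        out.append(acc)
        prev = bit
    return out

if __name__ == "__main__":
    bits = adm_encode([1.0] * 5)           # a sudden jump in the input
    print(bits, adm_decode(bits))
```

For the jump input, five decisions of equal sign raise the estimate to about 0.132, whereas a fixed step of 0.01 would reach only 0.05 in the same number of samples.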

Several other variations of adaptive DM encoding have been investigated and described
in the technical literature. A particularly effective and popular technique first proposed by
Greefkes [15] is called continuously variable slope delta modulation (CVSD). In CVSD the
adaptive step-size parameter may be expressed as

$$\Delta(n) = \alpha\,\Delta(n-1) + k_1 \tag{9-21}$$

if $\tilde{e}(n)$, $\tilde{e}(n-1)$, and $\tilde{e}(n-2)$ have the same sign; otherwise

$$\Delta(n) = \alpha\,\Delta(n-1) + k_2 \tag{9-22}$$

The parameters α, $k_1$, and $k_2$ are selected such that 0 < α < 1 and $k_1 > k_2 > 0$. For more discussion
on this and other variations of adaptive DM, the interested reader is referred to the papers by
Jayant [18] and Flanagan et al. [13] and to the extensive references contained in these papers.

DM and ADM Experiment 

The purpose of this experiment is to gain an understanding of delta modulation and
adaptive delta modulation for coding of waveforms. This experiment involves writing
software modules for the DM encoder and decoder as shown in Figure 9-8, and for the
ADM encoder and decoder shown in Figure 9-9. The lowpass filter at the decoder can be
implemented as a linear-phase FIR filter. For example, a Hanning filter which has the
impulse response

$$h(n) = \frac{1}{2} \left[ 1 - \cos\frac{2\pi n}{N-1} \right], \qquad 0 \le n \le N-1 \tag{9-23}$$

may be used, where the length N may be selected in the range 5 ≤ N ≤ 15.

The input to the DM and ADM systems can be supplied either from internally generated
waveform data files, or from an external waveform generator, or from a microphone input,
just as in the case of the PCM Experiment. The output of the decoder can be monitored
and displayed from one of the DAC output ports. Comparisons should be made between
the output signal from the DM and ADM decoders and the original input signal.
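
The decoder lowpass filter suggested for this experiment can be sketched as follows. The coefficients follow the raised-cosine shape of (9-23); dividing them by their sum, so that a constant input passes with unit gain, is an added assumption of this sketch, and the function names are hypothetical.

```python
import math

def hanning_filter(N):
    """Raised-cosine impulse response of length N, scaled to unit DC gain."""
    h = [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (N - 1))) for n in range(N)]
    total = sum(h)
    return [value / total for value in h]

def fir_filter(h, x):
    """Direct-form FIR filtering with zero initial conditions."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:
                acc += hk * x[n - k]
        y.append(acc)
    return y

if __name__ == "__main__":
    h = hanning_filter(9)
    print([round(v, 4) for v in h])
    # Smoothing a staircase-like input such as a delta modulator output:
    print([round(v, 3) for v in fir_filter(h, [0.0] * 5 + [1.0] * 15)])
```

The impulse response is symmetric about its midpoint, which is what gives the filter its linear phase.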

9.5 LINEAR PREDICTIVE CODING (LPC) OF SPEECH 

The linear predictive coding (LPC) method for speech analysis and synthesis is based on
modeling the vocal tract as a linear all-pole (IIR) filter having the system function

$$H(z) = \frac{G}{1 + \sum_{k=1}^{p} a_p(k)\, z^{-k}} \tag{9-24}$$

where p is the number of poles, G is the filter gain, and $\{a_p(k)\}$ are the parameters that determine
the poles. There are two mutually exclusive excitation functions to model voiced and unvoiced
speech sounds. On a short-time basis, voiced speech is periodic with a fundamental frequency
$F_0$, or a pitch period $1/F_0$, which depends on the speaker. Thus voiced speech is generated by
exciting the all-pole filter model by a periodic impulse train with a period equal to the desired
pitch period. Unvoiced speech sounds are generated by exciting the all-pole filter model by the
output of a random-noise generator. This model is shown in Figure 9-10.

Figure 9-10: Block Diagram Model for the Generation of a Speech Signal

Given a short-time segment of a speech signal, usually about 20 ms or 160 samples at an
8 kHz sampling rate, the speech encoder at the transmitter must determine the proper excitation
function, the pitch period for voiced speech, the gain parameter G, and the coefficients $a_p(k)$.
A block diagram that illustrates the speech encoding system is given in Figure 9-11. The
parameters of the model are determined adaptively from the data and encoded into a binary
sequence and transmitted to the receiver. At the receiver, the speech signal is synthesized from
the model and the excitation signal.

Figure 9-11: Encoder and Decoder for LPC



The parameters of the all-pole filter model are easily determined from the speech samples
by means of linear prediction. To be specific, the output of the linear predictor, an FIR filter, is

$$\hat{s}(n) = -\sum_{k=1}^{p} a_p(k)\, s(n-k) \tag{9-25}$$

and the corresponding error between the observed sample s(n) and the estimate $\hat{s}(n)$ is

$$e(n) = s(n) + \sum_{k=1}^{p} a_p(k)\, s(n-k) \tag{9-26}$$

By minimizing the sum of squared errors, i.e.

$$\mathcal{E} = \sum_{n=0}^{N-1} e^2(n) = \sum_{n=0}^{N-1} \left[ s(n) + \sum_{k=1}^{p} a_p(k)\, s(n-k) \right]^2 \tag{9-27}$$

we can determine the pole parameters $\{a_p(k)\}$ of the model. The result of differentiating $\mathcal{E}$ with
respect to each of the parameters and equating the result to zero, is a set of p linear equations

$$\sum_{k=1}^{p} a_p(k)\, r_{ss}(m-k) = -r_{ss}(m), \qquad m = 1, 2, \ldots, p \tag{9-28}$$

where $r_{ss}(m)$ is the autocorrelation of the sequence s(n) defined as

$$r_{ss}(m) = \sum_{n=0}^{N-1} s(n)\, s(n+m) \tag{9-29}$$

These equations can be solved recursively and most efficiently, without resorting to matrix
inversion, by using the Levinson-Durbin algorithm (for reference see Levinson [20], Durbin
[12], or Proakis and Manolakis [8]). These recursive equations are



$$a_m(m) = K_m = -\frac{r_{ss}(m) + \sum_{k=1}^{m-1} a_{m-1}(k)\, r_{ss}(m-k)}{\mathcal{E}_{m-1}}, \qquad m = 1, 2, \ldots, p$$

$$a_m(k) = a_{m-1}(k) + K_m\, a_{m-1}(m-k) = a_{m-1}(k) + a_m(m)\, a_{m-1}(m-k), \qquad k = 1, 2, \ldots, m-1; \quad m = 1, 2, \ldots, p \tag{9-30}$$

$$\mathcal{E}_m = \mathcal{E}_{m-1}\,(1 - K_m^2), \qquad m = 1, 2, \ldots, p; \qquad \mathcal{E}_0 = r_{ss}(0)$$

where $\{K_m\}$ are the reflection coefficients in the equivalent lattice filter. The prediction coefficients
in the all-pole model are $\{a_p(k)\}$ and the residual prediction squared error is $\mathcal{E}_p$.

The gain parameter of the filter can be obtained by noting that its input-output equation
is

$$s(n) = -\sum_{k=1}^{p} a_p(k)\, s(n-k) + G\,x(n) \tag{9-31}$$

where x(n) is the input sequence. Clearly,

$$G\,x(n) = s(n) + \sum_{k=1}^{p} a_p(k)\, s(n-k) = e(n)$$

Then

$$G^2 \sum_{n=0}^{N-1} x^2(n) = \sum_{n=0}^{N-1} e^2(n) \tag{9-32}$$

If the input excitation is normalized to unit energy by design, then

$$G^2 = \sum_{n=0}^{N-1} e^2(n) = r_{ss}(0) + \sum_{k=1}^{p} a_p(k)\, r_{ss}(k) \tag{9-33}$$

Thus, $G^2$ is set equal to the residual energy resulting from the least squares optimization.
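
The Levinson-Durbin recursion and the gain relation (9-33) can be sketched together as follows. The autocorrelation helper assumes that the speech frame is zero outside the analysis interval, and the function names and the small test sequence are hypothetical; Python is used here only as a compact notation.

```python
import math

def autocorrelation(s, max_lag):
    """r_ss(m) for m = 0..max_lag, taking s(n) = 0 outside the frame (assumed)."""
    N = len(s)
    return [sum(s[n] * s[n + m] for n in range(N - m))
            for m in range(max_lag + 1)]

def levinson_durbin(r, p):
    """Solve (9-28) recursively.

    Returns (a, K, E): a[k-1] is a_p(k), K[m-1] is K_m, E is the residual energy.
    """
    a = [0.0] * (p + 1)            # a[k] holds a_m(k); index 0 is unused
    K = [0.0] * (p + 1)
    E = r[0]                       # initial residual energy r_ss(0)
    for m in range(1, p + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        K[m] = -acc / E
        prev = a[:]                # coefficients of order m - 1
        a[m] = K[m]
        for k in range(1, m):
            a[k] = prev[k] + K[m] * prev[m - k]
        E *= 1.0 - K[m] ** 2
    return a[1:], K[1:], E

def lpc_gain(E):
    """Gain G for a unit-energy excitation, from (9-33)."""
    return math.sqrt(E)

if __name__ == "__main__":
    frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(160)]
    r = autocorrelation(frame, 4)
    a, K, E = levinson_durbin(r, 4)
    print("a =", [round(v, 4) for v in a])
    print("K =", [round(v, 4) for v in K])
    print("G =", round(lpc_gain(E), 4))
```

The residual energy returned by the recursion equals the right-hand side of (9-33), so the gain follows without a separate pass over the data.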

Once the LPC coefficients are computed, we can determine whether the input speech
frame is voiced, and if so, what the pitch is. This is accomplished by computing the sequence

$$r_e(n) = \sum_{k=1}^{p} r_a(k)\, r_{ss}(n-k) \tag{9-34}$$

where $r_a(k)$ is defined as

$$r_a(k) = \sum_{i=1}^{p} a_p(i)\, a_p(i+k) \tag{9-35}$$

which is the autocorrelation sequence of the prediction coefficients. The pitch is detected by
finding the peak of the normalized sequence $r_e(n)/r_e(0)$ in the time interval that corresponds to
3 to 15 ms in the 20 ms sampling frame. If the value of this peak is at least 0.25, the frame of
speech is considered voiced with a pitch period equal to the value of $n = N_p$, where $r_e(N_p)/r_e(0)$
is a maximum. If the peak value is less than 0.25, the frame of speech is considered unvoiced
and the pitch is zero.

The values of the LPC coefficients, the pitch period and the type of excitation are transmitted
to the receiver where the decoder synthesizes the speech signal by passing the proper
excitation through the all-pole filter model of the vocal tract. Typically, the pitch period requires
6 bits and the gain parameter may be represented by 5 bits after its dynamic range is compressed
logarithmically. If the prediction coefficients were to be coded, they would require between 8
and 10 bits per coefficient for accurate representation. The reason for such high accuracy is that
relatively small changes in the prediction coefficients result in a large change in the pole positions
of the filter model. The accuracy requirements are lessened by transmitting the reflection
coefficients $\{K_i\}$, which have a smaller dynamic range, i.e. $|K_i| < 1$. These are adequately
represented by 6 bits per coefficient. Thus, for a 10th-order predictor the total number of bits
assigned to the model parameters per frame is 72. If the model parameters are changed every
20 milliseconds, the resulting bit rate is 3,600 bps. Since the reflection coefficients are usually
transmitted to the receiver, the synthesis filter at the receiver is implemented as an all-pole lattice
filter, described in Chapter 7.
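
The bit budget quoted above can be reproduced with a few lines of arithmetic. The sketch assumes that the type of excitation (voiced or unvoiced) occupies one bit per frame, which is the allocation that makes the listed items add up to 72 bits; the function name is hypothetical.

```python
def lpc_bit_rate(order=10, bits_per_coefficient=6, pitch_bits=6,
                 gain_bits=5, excitation_bits=1, frame_ms=20):
    """Return (bits per frame, bits per second) for the LPC parameter set."""
    bits_per_frame = (order * bits_per_coefficient
                      + pitch_bits + gain_bits + excitation_bits)
    frames_per_second = 1000 // frame_ms
    return bits_per_frame, bits_per_frame * frames_per_second

if __name__ == "__main__":
    print(lpc_bit_rate())
```

With the default values this gives 60 + 6 + 5 + 1 = 72 bits per frame and 50 frames per second, hence 3,600 bps.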

LPC Experiment 

The objective of this experiment is to synthesize a speech signal that has been processed
through an LPC coder. The decoder that performs the synthesis is an all-pole lattice whose
parameters are the reflection coefficients that have been pre-computed by the LPC speech
analyzer. The additional parameters required are the gain G, the type of excitation and,
if the excitation is a periodic impulse train for voiced speech, the pitch period.
The output of this experiment is a speech signal that can be compared with the original
speech signal. The distortion effects due to LPC analysis/synthesis may be assessed
qualitatively.

9.6 DUAL-TONE MULTIFREQUENCY (DTMF) SIGNALS

DTMF is the generic name for push-button telephone signaling that is equivalent to the Touch 
Tone system in use within the Bell System. DTMF also finds widespread use in electronic mail 
systems and telephone banking systems in which the user can select options from a menu by 
sending DTMF signals from a telephone. 

In a DTMF signaling system a combination of a high frequency tone and a low frequency
tone represents a specific digit or the characters * and #. There are eight frequencies which are
arranged as shown in Figure 9-12, to accommodate a total of 16 characters, 12 of which are
assigned as shown while the other four are reserved for future use.

DTMF digit = Row Tone + Column Tone

Figure 9-12: DTMF Digits



DTMF signals are easily generated in software on a microcomputer and detected by means
of digital filters, also implemented in software, that are tuned to the eight frequency tones.
Usually, DTMF signals are interfaced to the analog world via a codec (coder/decoder) chip or
by linear A/D and D/A converters. Codec chips contain all the necessary A/D and D/A, sampling,
and filtering circuitry for a bi-directional analog/digital interface.

The ADSP-2101 has been programmed to read DTMF digits stored in data memory in a
relocatable look-up list. Alternatively, a DTMF keypad could be used for digit entry. In either
case, the resultant DTMF tones may be generated either mathematically or from a look-up table.
In the ADSP-2101, digital samples of two sine waves are generated mathematically, scaled,
and added together. The sum is logarithmically compressed and sent to the codec for conversion
to an analog signal. At an 8 kHz sampling rate, the ADSP-2101 must output a sample every
125 μs. In this case, a sine look-up table is not used because the values of the sinewave can be
computed quickly without using the large amount of data memory that a table look-up would
require.
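
The tone generation step can be sketched as follows. The row and column frequencies and the keypad layout are the standard DTMF assignments, which we assume match Figure 9-12 (the letters A to D label the four reserved combinations by convention); the scaling of each tone to half of full scale, the tone duration, and the function names are choices made for this sketch. The logarithmic compression applied before the codec is not included.

```python
import math

ROW_HZ = (697.0, 770.0, 852.0, 941.0)         # low-frequency (row) tones
COLUMN_HZ = (1209.0, 1336.0, 1477.0, 1633.0)  # high-frequency (column) tones
KEYPAD = ("123A", "456B", "789C", "*0#D")

def dtmf_frequencies(key):
    """Return the (row, column) tone pair in Hz for a single keypad character."""
    if len(key) == 1:
        for row, keys in enumerate(KEYPAD):
            column = keys.find(key)
            if column >= 0:
                return ROW_HZ[row], COLUMN_HZ[column]
    raise ValueError(f"not a DTMF key: {key!r}")

def dtmf_samples(key, duration_s=0.05, fs=8000.0, tone_amplitude=0.5):
    """Sum of two scaled sinusoids, one sample every 1/fs seconds."""
    f_row, f_col = dtmf_frequencies(key)
    count = int(round(duration_s * fs))
    return [tone_amplitude * (math.sin(2.0 * math.pi * f_row * n / fs)
                              + math.sin(2.0 * math.pi * f_col * n / fs))
            for n in range(count)]

if __name__ == "__main__":
    print(dtmf_frequencies("5"))
    print(len(dtmf_samples("5")), "samples at a spacing of", 1e6 / 8000.0, "microseconds")
```

Scaling each tone to half of full scale keeps the sum within the range [-1, 1], so it can be represented as a fractional number without overflow.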

At the receiving end, the ADSP-2101 reads the logarithmically compressed, 8-bit digital
data words from the codec, logarithmically expands each 8-bit sample to its 16-bit linear format
and then detects the tones to decide on the transmitted digit. The detection algorithm can be a
DFT implementation using the FFT algorithm or a filter bank implementation. For the relatively
small number of tones to be detected, the filter bank implementation is more efficient. Below,
we describe the use of the Goertzel algorithm to implement the eight tuned filters.

Recall from the discussion in Chapter 8 that the DFT of an N-point data sequence {x(n)}
is

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1 \tag{9-36}$$

If the FFT algorithm is used to perform the computation of the DFT, the number of computations
(complex multiplications and additions) is $N \log_2 N$. In this case, we obtain all N values of the
DFT at once. However, if we desire to compute only M points of the DFT, where $M < \log_2 N$,
then a direct computation of the DFT is more efficient. The Goertzel algorithm, which is
described below, is basically a linear filtering approach to the computation of the DFT and
provides an alternative to direct computation.

9.6.1 The Goertzel Algorithm

The Goertzel algorithm exploits the periodicity of the phase factors $\{W_N^k\}$ and allows us
to express the computation of the DFT as a linear filtering operation. Since $W_N^{-kN} = 1$, we can
multiply the DFT by this factor. Thus

$$X(k) = W_N^{-kN} \sum_{m=0}^{N-1} x(m)\, W_N^{km} = \sum_{m=0}^{N-1} x(m)\, W_N^{-k(N-m)} \tag{9-37}$$

We note that (9-37) is in the form of a convolution. Indeed, if we define the sequence $y_k(n)$ as

$$y_k(n) = \sum_{m=0}^{N-1} x(m)\, W_N^{-k(n-m)} \tag{9-38}$$

then it is clear that $y_k(n)$ is the convolution of the finite-duration input sequence x(n) of length
N with a filter that has an impulse response

$$h_k(n) = W_N^{-kn}\, u(n) \tag{9-39}$$

The output of this filter at n = N yields the value of the DFT at the frequency $\omega_k = 2\pi k/N$. That
is,

$$X(k) = y_k(n)\big|_{n=N} \tag{9-40}$$

as can be verified by comparing (9-37) with (9-38).

The filter with impulse response $h_k(n)$ has the system function

$$H_k(z) = \frac{1}{1 - W_N^{-k} z^{-1}} \tag{9-41}$$

This filter has a pole on the unit circle at the frequency $\omega_k = 2\pi k/N$. Thus the entire DFT can
be computed by passing the block of input data into a parallel bank of N single-pole filters
(resonators), where each filter has a pole at the corresponding frequency of the DFT.

Instead of performing the computation of the DFT as in (9-38), via convolution, we can
use the difference equation corresponding to the filter given by (9-41) to compute $y_k(n)$
recursively. Thus we have

$$y_k(n) = W_N^{-k}\, y_k(n-1) + x(n), \qquad y_k(-1) = 0 \tag{9-42}$$

The desired output is $X(k) = y_k(N)$. To perform this computation, we can compute once and
store the phase factor $W_N^{-k}$.

The complex multiplications and additions inherent in (9-42) can be avoided by combining 
the pairs of resonators possessing complex-conjugate poles. This leads to two-pole filters with 
system functions of the form 

$$H_k(z) = \frac{1 - W_N^{k} z^{-1}}{1 - 2\cos(2\pi k/N)\, z^{-1} + z^{-2}} \tag{9-43}$$

The realization of the system illustrated in Figure 9-13 is described by the difference equations 

$$v_k(n) = 2\cos\frac{2\pi k}{N}\, v_k(n-1) - v_k(n-2) + x(n) \tag{9-44}$$

$$y_k(n) = v_k(n) - W_N^{k}\, v_k(n-1) \tag{9-45}$$

with initial conditions $v_k(-1) = v_k(-2) = 0$. This is the Goertzel algorithm. 

The recursive relation in (9-44) is iterated for n = 0, 1, ..., N, but the equation in (9-45) 
is computed only once, at time n = N. Each iteration requires one real multiplication and two 
additions. Consequently, for a real input sequence x(n), this algorithm requires N + 1 real 
multiplications to yield not only X(k) but also, due to symmetry, the value of X(N - k). 
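
The two-pole recursion and the single final step can be sketched in Python as follows. This is a minimal illustration, not the ADSP-2101 routine: the function names are hypothetical, the input x(N) at the last iteration is taken as zero, and the squared magnitude is obtained by expanding $|v_k(N) - W_N^k v_k(N-1)|^2$ for real $v_k(n)$. A direct DFT sum is included only as a cross-check.

```python
import cmath
import math


def goertzel_power(x, k):
    """Return |X(k)|^2 for the real sequence x using the two-pole recursion."""
    n_points = len(x)
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n_points)
    v1 = 0.0  # v_k(n-1), initial condition v_k(-1) = 0
    v2 = 0.0  # v_k(n-2), initial condition v_k(-2) = 0
    # Iterate n = 0, ..., N; the extra step at n = N uses x(N) = 0.
    for sample in list(x) + [0.0]:
        v0 = coeff * v1 - v2 + sample
        v2, v1 = v1, v0
    # After the loop v1 = v_k(N) and v2 = v_k(N-1).
    # |v1 - W_N^k v2|^2 = v1^2 + v2^2 - 2 cos(2 pi k / N) v1 v2 for real v1, v2.
    return v1 * v1 + v2 * v2 - coeff * v1 * v2


def dft_power(x, k):
    """Direct evaluation of |X(k)|^2, used only as a cross-check."""
    n_points = len(x)
    total = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                for n in range(n_points))
    return abs(total) ** 2
```

Only one real coefficient per frequency is stored, and no complex arithmetic appears inside the loop, which is the property that makes the method attractive when only a few DFT values are needed.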

We can now implement the DTMF decoder by use of the Goertzel algorithm. Since there 
are eight possible tones to be detected, we require eight filters of the type given by (9-43), with 
each filter tuned to one of the eight frequencies. In the DTMF detector, there is no need to 
compute the complex value X(k); only the magnitude |X(k)| or the magnitude-squared value 
$|X(k)|^2$ will suffice. Consequently, the final step in the computation of the DFT value involving 
the numerator term (feedforward part of the filter computation) can be simplified. In particular, 
we have 

$$|X(k)|^2 = |y_k(N)|^2 = \left| v_k(N) - W_N^{k}\, v_k(N-1) \right|^2 = v_k^2(N) + v_k^2(N-1) - 2\cos\frac{2\pi k}{N}\, v_k(N)\, v_k(N-1)$$

Thus, complex-valued arithmetic operations are completely eliminated in the DTMF detector. 

Figure 9-13: Realization of Two-Pole Resonator for Computing the DFT 

DTMF Experiment 



The objective of this experiment is to gain an understanding of the DTMF tone generation 
software and the DTMF decoding algorithm (the Goertzel algorithm). In this experiment, 
a dialing sequence of several digits may be stored in data memory, DIAL-LIST. The 
DTMF digits are stored in the four least significant bits of the 16-bit data word as follows: 



DTMF Digit   Hex Word      DTMF Digit   Hex Word 

    1        0x0000            7        0x8000 
    2        0x1000            8        0x9000 
    3        0x2000            9        0xA000 
    A        0x3000            C        0xB000 
    4        0x4000            *        0xC000 
    5        0x5000            0        0xD000 
    6        0x6000            #        0xE000 
    B        0x7000            D        0xF000 

Delimiter      Hex Word 

Quiet space    x F x 
Stop dialing   0xFxxx 
Redial         0x0Fxx 

A simple finite state machine is implemented in which the IRQ2 switch is used to 
advance the state. The states are as follows: 

State 0   No digits are generated 

State 1   A continuous dial tone (350 Hz + 440 Hz) is generated 

State 2   DTMF dialing occurs 

When a Stop Dialing delimiter is read in the DIAL-LIST, the machine jumps back to 
State 0. 

A single-channel DTMF decoder has been implemented in software for this experiment. 
The valid decoded DTMF digits are sent to the DAC port for display. The relationship 
between the received DTMF digits and the hexadecimal code output is as follows: 

In addition to exercising the program described above, we suggest that the student use a 
spectrum analyzer to observe the frequency components of the generated tones. 



9.7 BINARY DIGITAL COMMUNICATIONS 

Digitized speech signals that have been encoded via PCM, ADPCM, DM and LPC are usually 
transmitted to the decoder by means of digital modulation. A binary digital communications 
system employs two signal waveforms, say $s_1(t) = s(t)$ and $s_2(t) = -s(t)$, to transmit the binary 
sequence representing the speech signal. The signal waveform s(t), which is nonzero over the 
interval 0 ≤ t ≤ T, is transmitted to the receiver if the data bit is a 1 and the signal waveform 
-s(t), 0 ≤ t ≤ T, is transmitted if the data bit is a 0. The time interval T is called the signal interval 
and the bit rate over the channel is R = 1/T bits per second. A typical signal waveform s(t) is 
a rectangular pulse, i.e., s(t) = A, 0 ≤ t ≤ T, which has energy $A^2 T$. 

In practice, the signal waveforms transmitted over the channel are corrupted by additive 
noise and other types of channel distortions that ultimately limit the performance of the 
communications system. As a measure of performance we normally use the average probability of 
error, which is often called the bit error rate. 

Experiment on Binary Data Communications System 

The purpose of this experiment is to investigate the performance of a binary data 
communications system on an additive noise channel by means of simulation. The basic 
configuration of the system to be simulated is shown in Fig. 9-14. Five software modules 
are required. 

Figure 9-14: Model of Binary Data Communications System 



1. A binary data generator module that generates a sequence of independent binary digits 
with equal probability. 

2. A modulator module that maps a binary digit 1 into a sequence of M consecutive +1's, 
and maps a binary digit 0 into a sequence of M consecutive -1's. Thus, the M consecutive 
+1's represent a sampled version of the rectangular pulse. 

3. A noise generator that generates a sequence of uniformly distributed numbers over 
the interval (-a, a). Each noise sample is added to a corresponding signal sample. 

4. A demodulator module that sums the M successive outputs of the noise-corrupted 
sequence of +1's or -1's received from the channel. We assume that the demodulator 
is time synchronized so that it knows the beginning and end of each waveform. 

5. A detector and error counting module. The detector compares the output of the 
demodulator with zero and decides in favor of 1 if the output is greater than zero and in 
favor of 0 if the output is less than zero. If the output of the detector does not agree 
with the transmitted bit from the transmitter, an error is counted by the counter. The 
error rate depends on the ratio of the size of M to the additive noise power, which is 
$P_n = a^2/3$ for numbers uniformly distributed over (-a, a). 

A flowchart of the simulation program is illustrated in Fig. 9-15. The measured error 
rate can be displayed for different signal-to-noise ratios, either by changing M and keeping 
$P_n$ fixed or vice versa. 
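
The five modules can be sketched in a few lines of Python. This is only a simulation outline under stated assumptions: the function name, its arguments, the fixed seed, and the use of Python's random module in place of a DSP random number routine are all choices made for this example.

```python
import random


def simulate_binary_link(num_bits, m, a, seed=1):
    """Return the measured bit error rate for M samples per bit and
    uniformly distributed noise on (-a, a)."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(num_bits):
        bit = rng.randint(0, 1)                       # module 1: data generator
        level = 1.0 if bit == 1 else -1.0             # module 2: modulator
        received = [level + rng.uniform(-a, a)        # module 3: additive noise
                    for _ in range(m)]
        decision_variable = sum(received)             # module 4: demodulator
        detected = 1 if decision_variable > 0 else 0  # module 5: detector
        if detected != bit:                           # module 5: error counter
            errors += 1
    return errors / num_bits
```

Calling the function for several values of M with a fixed, or for several values of a with M fixed, gives the error-rate curves requested in the experiment.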



9.8 SPREAD SPECTRUM COMMUNICATIONS 

Spread spectrum signals are often used in the transmission of digital data over communication 
channels that are corrupted by interference due to intentional jamming or from other users of 
the channel. In applications other than communications, spread spectrum signals are used to 
obtain accurate range (time delay) and range rate (velocity) measurements in radar and 
navigation. For the sake of brevity, we shall limit our discussion to the use of spread spectrum for 
digital communications. Such signals have the characteristic that their bandwidth is much 
greater than the information rate in bits per second. 

Figure 9-15: Flowchart for Simulating a Binary Data Communications System 



In combatting intentional interference (jamming), it is important to the communicators 
that the jammer who is trying to disrupt their communication does not have prior knowledge 
of the signal characteristics. To accomplish this, the transmitter introduces an element of 
unpredictability or randomness (pseudo-randomness) in each of the possible transmitted signal 
waveforms, which is known to the intended receiver, but not to the jammer. As a consequence, 
the jammer must transmit an interfering signal without knowledge of the pseudo-random 
characteristics of the desired signal. 

Interference from other users arises in multiple-access communications systems in which 
a number of users share a common communications channel. At any given time, a subset of 
these users may transmit information simultaneously over a common channel to corresponding 
receivers. The transmitted signals in this common channel may be distinguished from one 
another by superimposing a different pseudo-random pattern, called a multiple-access code, in 
each transmitted signal. Thus, a particular receiver can recover the transmitted data intended 
for it by knowing the pseudo-random pattern, i.e., the key used by the corresponding transmitter. 
This type of communication technique, which allows multiple users to simultaneously use a 
common channel for data transmission, is called code division multiple access (CDMA). 

The block diagram shown in Fig. 9-16 illustrates the basic elements of a spread spectrum 
digital communications system. It differs from a conventional digital communications system 
by the inclusion of two identical pseudo-random pattern generators, one which interfaces with 
the modulator at the transmitting end and the second which interfaces with the demodulator at 
the receiving end. The generators generate a pseudo-random or pseudo-noise (PN) 
binary-valued sequence (±1's) which is impressed on the transmitted signal at the modulator and 
removed from the received signal at the demodulator. 

Figure 9-16: Basic Spread Spectrum Digital Communications System 



Synchronization of the PN sequence generated at the demodulator with the PN sequence 
contained in the incoming received signal is required in order to demodulate the received signal. 
Initially, prior to the transmission of data, synchronization is achieved by transmitting a short 
fixed PN sequence to the receiver for purposes of establishing synchronization. After time 
synchronization of the PN generators is established, the transmission of data commences. 

Experiment on Binary Spread Spectrum Communications 

The objective of this experiment is to demonstrate the effectiveness of a PN spread 
spectrum signal in suppressing sinusoidal interference. Let us consider the binary 
communication system described in the experiment of Section 9.7, and let us multiply the 
output of the modulator by a binary (±1) PN sequence. The same binary PN sequence is 
used to multiply the input to the demodulator and, thus, to remove the effect of the PN 
sequence in the desired signal. The channel corrupts the transmitted signal by the addition 
of a wideband noise sequence {w(n)} and a sinusoidal interference sequence of the form 
$i(n) = A \sin \omega_0 n$, where $0 < \omega_0 < \pi$. We may assume that A > M, where M is the number 
of samples per bit from the modulator. The basic binary spread spectrum system is shown 
in Fig. 9-17. As can be observed, this is just the binary digital communication system 
shown in Fig. 9-14, to which we have added the sinusoidal interference and the PN 
sequence generators. 



[Block diagram: binary data generator (±1), modulator s(n), PN sequence generator p(n), sine generator i(n), noise generator w(n), channel, PN sequence generator p(n), demodulator, detector and error count, output.] 

Figure 9-17: Block Diagram of Binary PN Spread Spectrum System for Simulation Experiment 

The PN sequence may be generated by using a random number generator to 
generate a sequence of equally probable ±1's. 

Draw a flowchart and simulate the spread-spectrum system shown in Fig. 9-17. Run 
the simulated system with and without the use of the PN sequence and measure the error 
rate under the condition that A > M for different values of M, e.g., M = 50, 100, 500, 1000. 
Explain the effect of the PN sequence on the sinusoidal interference signal. Thus explain 
why the PN spread spectrum system outperforms the conventional binary communication 
system in the presence of the sinusoidal jamming signal. 
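
As a starting point for the simulation, the following Python sketch adds the PN multiplication at the transmitter and receiver, together with the sinusoidal interference, to the binary link of Section 9.7. It is an outline under stated assumptions, not a reference solution: the function name, argument names, seed, and the use of Python's random module for the ±1 PN sequence are choices made for this example. The amplitude, frequency, and noise values in the usage lines are illustrative only and were picked for a short run (in particular, A is smaller than M there), whereas the experiment itself calls for A > M.

```python
import math
import random


def simulate_spread_link(num_bits, m, amplitude, omega0, noise_a,
                         use_pn=True, seed=1):
    """Return the measured error rate of the simulated link."""
    rng = random.Random(seed)
    errors = 0
    n = 0                                   # running sample index for i(n)
    for _ in range(num_bits):
        bit = rng.randint(0, 1)
        level = 1.0 if bit == 1 else -1.0
        decision_variable = 0.0
        for _ in range(m):
            # PN chip: equally probable +1 or -1 (all +1 when PN is disabled)
            chip = rng.choice((-1.0, 1.0)) if use_pn else 1.0
            transmitted = level * chip      # modulator output times PN sequence
            received = (transmitted
                        + amplitude * math.sin(omega0 * n)
                        + rng.uniform(-noise_a, noise_a))
            decision_variable += received * chip   # same PN sequence at receiver
            n += 1
        detected = 1 if decision_variable > 0 else 0
        if detected != bit:
            errors += 1
    return errors / num_bits


# Illustrative comparison with assumed parameter values.
without_pn = simulate_spread_link(500, 200, 3.0, 0.01, 1.0, use_pn=False)
with_pn = simulate_spread_link(500, 200, 3.0, 0.01, 1.0, use_pn=True)
```

Multiplying the received samples by the PN sequence restores the data samples to ±1 while turning the slowly varying sinusoid into a sequence with random signs, so that its contribution to the sum over M samples no longer adds coherently.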



9.9 SUMMARY 

In this chapter we focused on applications of the ADSP-2101 to waveform representation and 
coding. In particular, we described several methods for digitizing analog waveforms, including 
PCM, DPCM, ADPCM, DM, ADM, and LPC. These methods have been widely used for speech 
coding and transmission. Experiments involving these waveform encoding methods were 
formulated for implementation on the ADSP-2100 family of microcomputers. 

We also described signal detection and communication systems where the ADSP-2101 may be 
used to perform the signal processing tasks. Experiments were also devised for these 
applications. 

ADAPTIVE FILTERS AND THEIR APPLICATIONS 



10.1 INTRODUCTION 

In Chapters 6 and 7 we described methods for designing FIR and IIR digital filters to satisfy 
some desired specifications. Our goal was to determine the coefficients of the digital filter that 
met the desired specifications. 

In contrast to the filter design techniques considered in these two chapters, there are many 
digital signal processing applications in which the filter coefficients cannot be specified a priori. 
For example, let us consider a high-speed modem that is designed to transmit data over telephone 
channels. Such a modem employs a channel equalizer to compensate for the channel distortion. 
The modem must effectively transmit data through communication channels that have different 
frequency response characteristics and hence result in different distortion effects. The only way 
in which this is possible is if the channel equalizer has adjustable coefficients that can be 
optimized to minimize some measure of the distortion, on the basis of measurements performed 
on the characteristics of the channel. Such a filter with adjustable parameters is called an 
adaptive filter, in this case an adaptive equalizer. 

Numerous applications of adaptive filters have been described in the literature. Some of 
the more noteworthy applications include (1) adaptive antenna systems in which adaptive filters 
are used for beam steering and for providing nulls in the beam pattern to remove undesired 
interference (for a reference, see the paper by Widrow et al. [22]); (2) digital communication 
receivers in which adaptive filters are used to provide equalization of intersymbol interference 
and for channel identification (for reference, see Proakis [21]); (3) adaptive noise canceling 
techniques in which an adaptive filter is used to estimate and eliminate a noise component in 
some desired signal (for reference, see papers by Widrow et al. [23], Hsu and Giordano [16], 
and Ketchum and Proakis [19]); (4) system modeling, in which an adaptive filter is used as a 
model to estimate the characteristics of an unknown system. These are just a few of the best 
known examples of the use of adaptive filters. 

Although both IIR and FIR filters have been considered for adaptive filtering, the FIR 
filter is by far the most practical and widely used. The reason for this preference is quite simple. 
The FIR filter has only adjustable zeros and hence it is free of stability problems associated 
with adaptive IIR filters that have adjustable poles as well as zeros. We should not conclude, 
however, that adaptive FIR filters are always stable. On the contrary, the stability of the filter 
depends critically on the algorithm for adjusting its coefficients. 

Of the various FIR filter structures that we may use, the direct form and the lattice form 
are the ones often used in adaptive filtering applications. The direct-form FIR filter structure 
with adjustable coefficients h(0), h(1), ..., h(N - 1) is illustrated in Figure 10-1. On the other 
hand, the adjustable parameters in an FIR lattice structure are the reflection coefficients. 

Figure 10-1: Direct-Form Adaptive FIR Filter 

An important consideration in the use of an adaptive filter is the criterion for optimizing 
the adjustable filter parameters. The criterion must not only provide a meaningful measure of 
filter performance, but it must also result in a practically realizable algorithm. 

One criterion that provides a good measure of performance in adaptive filtering 
applications is the least-squares criterion, and its counterpart in a statistical formulation of the 
problem, namely, the mean-square-error (MSE) criterion. The least-squares (and MSE) criterion 
results in a quadratic performance index as a function of the filter coefficients and hence it 
possesses a single minimum. The resulting algorithms for adjusting the coefficients of the filter 
are relatively easy to implement. 

In this chapter, we describe a basic algorithm, called the least-mean-square (LMS) 
algorithm, to adaptively adjust the coefficients of an FIR filter. The adaptive filter structure that 
will be implemented is the direct-form FIR filter structure with adjustable coefficients 
h(0), h(1), ..., h(N - 1), as illustrated in Figure 10-1. After we describe the LMS algorithm, we 
apply it to several practical systems in which adaptive filters are employed. 

10.2 LMS ALGORITHM FOR COEFFICIENT ADJUSTMENT 

Suppose we have an FIR filter with adjustable coefficients {h(k), 0 ≤ k ≤ N - 1}. Let {x(n)} 
denote the input sequence to the filter and let the corresponding output be {y(n)}, where 

$$y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k) \tag{10-1}$$

Suppose that we also have a desired sequence {d(n)} with which we can compare the FIR filter 
output. Then, we can form the error sequence {e(n)} by taking the difference between d(n) 
and y(n). That is 

$$e(n) = d(n) - y(n) \tag{10-2}$$

The coefficients of the FIR filter will be selected to minimize the sum of squared errors. 
Thus, we have 

$$\mathcal{E} = \sum_{n=0}^{M} e^2(n) = \sum_{n=0}^{M} \left[ d(n) - \sum_{k=0}^{N-1} h(k)\, x(n-k) \right]^2$$

$$= \sum_{n=0}^{M} d^2(n) - 2 \sum_{k=0}^{N-1} h(k)\, r_{dx}(k) + \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} h(k)\, h(l)\, r_{xx}(k-l) \tag{10-3}$$

where, by definition, 

$$r_{dx}(k) = \sum_{n=0}^{M} d(n)\, x(n-k), \qquad 0 \le k \le N-1 \tag{10-4}$$

$$r_{xx}(k) = \sum_{n=0}^{M} x(n)\, x(n+k), \qquad 0 \le k \le N-1 \tag{10-5}$$

We call $\{r_{dx}(k)\}$ the cross-correlation between the desired output sequence {d(n)} and the input 
sequence {x(n)} and $\{r_{xx}(k)\}$ is the auto-correlation sequence of {x(n)}. 

The sum of squared errors $\mathcal{E}$ is a quadratic function of the FIR filter coefficients. 
Consequently, the minimization of $\mathcal{E}$ with respect to the filter coefficients {h(k)} results in a set of 
linear equations. By differentiating $\mathcal{E}$ with respect to each of the filter coefficients we obtain 

$$\frac{\partial \mathcal{E}}{\partial h(m)} = 0, \qquad m = 0, 1, \ldots, N-1 \tag{10-6}$$

and, hence, 

$$\sum_{k=0}^{N-1} h(k)\, r_{xx}(k-m) = r_{dx}(m), \qquad m = 0, 1, \ldots, N-1 \tag{10-7}$$

This is the set of linear equations which yield the optimum filter coefficients. 

In order to solve the set of linear equations directly, we must first compute the 
auto-correlation sequence $\{r_{xx}(k)\}$ of the input signal and the cross-correlation sequence $\{r_{dx}(k)\}$ 
between the desired sequence {d(n)} and the input sequence {x(n)}. 
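
To make the direct approach concrete, the following Python sketch estimates the two correlation sequences from a finite data record and solves the resulting N linear equations by Gaussian elimination. It is a minimal illustration under stated assumptions: the function names are hypothetical, the sums are truncated at the ends of the available data, the symmetry of the auto-correlation sequence is used to fill the coefficient matrix, and the three-tap system used to create d(n) is invented for the example.

```python
import random


def correlations(x, d, n_taps):
    """Estimate the auto- and cross-correlation values for lags 0..N-1.

    The sums are truncated where the data record ends.
    """
    m = len(x)
    r_xx = [sum(x[n] * x[n + k] for n in range(m - k)) for k in range(n_taps)]
    r_dx = [sum(d[n] * x[n - k] for n in range(k, m)) for k in range(n_taps)]
    return r_xx, r_dx


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (aug[i][n] - tail) / aug[i][i]
    return sol


def optimum_fir(x, d, n_taps):
    """Solve sum_k h(k) r_xx(k - m) = r_dx(m) for m = 0..N-1."""
    r_xx, r_dx = correlations(x, d, n_taps)
    matrix = [[r_xx[abs(k - m)] for k in range(n_taps)] for m in range(n_taps)]
    return solve_linear(matrix, r_dx)


rng = random.Random(5)
target = [0.5, -0.3, 0.2]          # hypothetical system used to create d(n)
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(target[k] * x[n - k] for k in range(len(target)) if n - k >= 0)
     for n in range(len(x))]
h_opt = optimum_fir(x, d, len(target))
```

For this noise-free example the solution h_opt is close to the coefficients used to generate d(n); the small remaining difference comes from truncating the correlation sums at the ends of the record.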

The LMS algorithm provides an alternative computational method for determining the 
optimum filter coefficients {h(k)} without explicitly computing the correlation sequences 
$\{r_{xx}(k)\}$ and $\{r_{dx}(k)\}$. The algorithm is basically a recursive gradient (steepest-descent) method 
that finds the minimum of $\mathcal{E}$ and, thus, yields the set of optimum filter coefficients. 

We begin with any arbitrary choice for the initial values of {h(k)}, say $\{h_0(k)\}$. For 
example, we may begin with $h_0(k) = 0$, 0 ≤ k ≤ N - 1. Then, after each new input sample x(n) enters 
the adaptive FIR filter, we compute the corresponding output, say y(n), form the error signal 
e(n) = d(n) - y(n) and update the filter coefficients according to the equation 

$$h_n(k) = h_{n-1}(k) + \Delta\, e(n)\, x(n-k), \qquad k = 0, 1, \ldots, N-1, \quad n = 1, 2, \ldots \tag{10-8}$$

where $\Delta$ is called the step size parameter and x(n - k) is the sample of the input signal located 
at the kth tap of the filter at time n. This is the LMS recursive algorithm for adjusting the filter 
coefficients adaptively so as to minimize the sum of squared errors $\mathcal{E}$. 

The step size parameter $\Delta$ controls the rate of convergence of the algorithm to the optimum 
solution. A large value of $\Delta$ leads to large step size adjustments and, thus, to rapid convergence, 
while a small value of $\Delta$ results in slower convergence. However, if $\Delta$ is made too large the 
algorithm becomes unstable. To ensure stability $\Delta$ must be chosen to be in the range 

$$0 < \Delta < \frac{1}{10 N P_x} \tag{10-9}$$

where N is the length of the adaptive FIR filter and $P_x$ is the power in the input signal, which 
can be approximated by 

$$P_x \approx \frac{1}{M+1} \sum_{n=0}^{M} x^2(n) = \frac{r_{xx}(0)}{M+1} \tag{10-10}$$

The mathematical justification of the equations (10-9) and (10-10) and the proof that the 
LMS algorithm leads to the solution for the optimum filter coefficients are given in more advanced 
treatments of adaptive filters. The interested reader may refer to the books by Haykin [24] and 
Proakis [21]. 
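
The LMS recursion is short enough to sketch directly. The Python fragment below is an illustration only, not the ADSP-2101 implementation: the function names are hypothetical, the step size is set to half of an upper limit of the form 1/(10 N P_x), and the desired sequence is produced by a known three-tap FIR filter (an assumption of this example) so that convergence can be checked.

```python
import random


def max_step_size(n_taps, input_power):
    """Upper limit of the stability range, 1 / (10 N P_x)."""
    return 1.0 / (10.0 * n_taps * input_power)


def lms_step(h, taps, desired, step):
    """Apply one LMS coefficient update in place and return the error e(n).

    taps[k] must hold x(n-k) for k = 0, ..., N-1.
    """
    output = sum(hk * xk for hk, xk in zip(h, taps))
    error = desired - output
    for k in range(len(h)):
        h[k] += step * error * taps[k]
    return error


def adapt_to_fir(target, num_samples=5000, seed=1):
    """Adapt an FIR filter so that it reproduces the output of `target`."""
    rng = random.Random(seed)
    n_taps = len(target)
    h = [0.0] * n_taps                  # initial coefficients set to zero
    taps = [0.0] * n_taps               # x(n), x(n-1), ..., x(n-N+1)
    # Input is uniform on (-1, 1), whose power is 1/3.
    step = 0.5 * max_step_size(n_taps, 1.0 / 3.0)
    for _ in range(num_samples):
        taps = [rng.uniform(-1.0, 1.0)] + taps[:-1]
        desired = sum(t * xk for t, xk in zip(target, taps))
        lms_step(h, taps, desired, step)
    return h
```

For example, adapt_to_fir([0.5, -0.3, 0.2]) returns coefficients close to the three values supplied, since the desired sequence in this sketch contains no noise.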

Below, we apply the LMS algorithm to several practical applications involving adaptive 
filtering. 

10.3 SYSTEM IDENTIFICATION OR SYSTEM MODELING 

To formulate the problem, let us refer to Figure 10-2. We have an unknown linear system that 
we wish to identify. The unknown system may be an all-zero (FIR) system or a pole-zero (IIR) 
system. The unknown system will be approximated (modeled) by an FIR filter of length N. 
Both the unknown system and the FIR model are connected in parallel and are excited by the 
same input sequence {x(n)}. If {y(n)} denotes the output of the model and {d(n)} denotes the 
output of the unknown system, the error sequence is {e(n) = d(n) - y(n)}. If we minimize the 
sum of squared errors, we obtain the same set of linear equations as in (10-7). Therefore, the 
LMS algorithm given by (10-8) may be used to adapt the coefficients of the FIR model so that 
its output approximates the output of the unknown system. 

[Block diagram: noise signal generator, unknown system with output d(n), FIR model, error e(n) for the LMS algorithm.] 

Figure 10-2: Block Diagram of System Identification or System Modeling Problem 

Experiment in System Identification 

There are three basic software modules that are needed to perform this experiment. 

1. A noise signal generator that generates a sequence of random numbers with zero mean 
value. For example, we may generate a sequence of uniformly distributed random 
numbers over the interval (-a, a). Such a sequence of uniformly distributed numbers 
has an average value of zero and a variance of $a^2/3$. This signal sequence, call it 
{x(n)}, will be used as the input to the unknown system and the adaptive FIR model. 
In this case, the input signal {x(n)} has power $P_x = a^2/3$. 

2. An unknown system module which may be selected as an IIR filter and implemented 
by its difference equation. For example, we may select an IIR filter specified by the 
(two-pole, two-zero) difference equation 

$$d(n) = a_1 d(n-1) + a_2 d(n-2) + x(n) + b_1 x(n-1) + b_2 x(n-2) \tag{10-11}$$

where the parameters $(a_1, a_2)$ determine the positions of the poles and $(b_1, b_2)$ 
determine the positions of the zeros of the filter. These parameters are input variables 
to the program. 

3. An adaptive FIR filter module where the FIR filter has N tap coefficients that are 
adjusted by means of the LMS algorithm. The length N of the filter is an input variable 
to the program. 

The three modules are configured as shown in Figure 10-2. From this experiment we 
can determine how closely the impulse response of the FIR model approximates the 
impulse response of the unknown system after the LMS algorithm has converged. 

To monitor the convergence rate of the LMS algorithm, we may compute a short-term 
average of the squared error $e^2(n)$ and display it. That is, we may compute 

$$\mathrm{ASE}(m) = \frac{1}{K} \sum_{k=n+1}^{n+K} e^2(k) \tag{10-12}$$

where m = n/K = 1, 2, .... The averaging interval K may be selected in the range 
10 < K < 25. The effect of the choice of the step size parameter $\Delta$ on the convergence 
rate of the LMS algorithm may be observed by monitoring the ASE(m). 

A flowchart of the system identification program is shown in Figure 10-3. Besides the 
main part of the program, we have also included, as an aside, the computation of the 
impulse response of the unknown system, which can be obtained by exciting the system 
with a unit sample sequence δ(n). This actual impulse response can be compared with 
that of the FIR model after convergence of the LMS algorithm. The two impulse responses 
can be displayed for the purpose of comparison. 
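
A compact Python outline of the three modules, together with the short-term averaged squared error and the impulse response comparison, is given here as a sketch. The pole and zero parameters, the filter length, the step size, the averaging interval, and all function names are assumptions chosen for illustration; they are not values prescribed by the experiment.

```python
import random


def unknown_system(x, a1, a2, b1, b2):
    """Two-pole, two-zero IIR difference equation with zero initial state."""
    d = []
    for n in range(len(x)):
        d1 = d[n - 1] if n >= 1 else 0.0
        d2 = d[n - 2] if n >= 2 else 0.0
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        d.append(a1 * d1 + a2 * d2 + x[n] + b1 * x1 + b2 * x2)
    return d


def identify(x, d, n_taps, step, avg_len):
    """LMS adaptation of an N-tap FIR model; returns (h, list of ASE values)."""
    h = [0.0] * n_taps
    taps = [0.0] * n_taps
    ase, acc = [], 0.0
    for n in range(len(x)):
        taps = [x[n]] + taps[:-1]          # x(n), x(n-1), ..., x(n-N+1)
        y = sum(hk * xk for hk, xk in zip(h, taps))
        e = d[n] - y
        for k in range(n_taps):
            h[k] += step * e * taps[k]
        acc += e * e
        if (n + 1) % avg_len == 0:         # one ASE value per block of K errors
            ase.append(acc / avg_len)
            acc = 0.0
    return h, ase


rng = random.Random(7)
x = [rng.uniform(-1.0, 1.0) for _ in range(20000)]   # power 1/3
params = dict(a1=0.5, a2=-0.2, b1=0.4, b2=0.1)       # assumed example values
d = unknown_system(x, **params)
n_taps = 16
step = 0.8 / (10 * n_taps * (1.0 / 3.0))             # inside the stability range
h, ase = identify(x, d, n_taps, step, avg_len=20)
# Actual impulse response: excite the unknown system with a unit sample.
impulse = unknown_system([1.0] + [0.0] * (n_taps - 1), **params)
```

Displaying h next to impulse, and plotting ase against the block index, gives the two comparisons described in the experiment. Repeating the run with a smaller step value shows the slower decay of the averaged squared error.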

10.4 SUPPRESSION OF NARROWBAND INTERFERENCE IN A WIDEBAND SIGNAL 

Let us assume that we have a signal sequence {x(n)} that consists of a desired wideband signal 
sequence, say {w(n)}, corrupted by an additive narrowband interference sequence {s(n)}. The 
two sequences are uncorrelated. This problem arises in digital communications and in signal 
detection, where the desired signal sequence {w(n)} is a spread-spectrum signal while the 
narrowband interference represents a signal from another user of the frequency band or some 
intentional interference from a jammer who is trying to disrupt the communication or detection 
system. 

From a filtering point of view, our objective is to design a filter that suppresses the 
narrowband interference. In effect, such a filter should place a notch in the frequency band occupied 
by the interference. In practice, however, the frequency band of the interference might be 
unknown. Moreover, the frequency band of the interference may vary slowly in time. 

The narrowband characteristics of the interference allow us to estimate s(n) from past 
samples of the sequence x(n) = s(n) + w(n) and to subtract the estimate from x(n). Since the 
bandwidth of {s(n)} is narrow compared to the bandwidth of {w(n)}, the samples of {s(n)} 
are highly correlated. On the other hand, the wideband sequence {w(n)} has a relatively narrow 
correlation. 

Figure 10-3: Flowchart of System Identification Program 



The general configuration of the interference suppression system is shown in Figure 10-4. 
The signal x(n) is delayed by D samples, where the delay D is chosen sufficiently large so that 
the wideband signal components w(n) and w(n - D), which are contained in x(n) and x(n - D), 
respectively, are uncorrelated. The output of the adaptive FIR filter is the estimate 

$$\hat{s}(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k-D) \tag{10-13}$$

[Block diagram labels: e(n) = x(n) - ŝ(n); Overall Interference Suppression Filter; Desired Signal.] 

Figure 10-4: Adaptive Filter for Estimating and Suppressing a Narrowband Interference in a 
Wideband Signal 



The error signal that is used in optimizing the FIR filter coefficients is $e(n) = x(n) - \hat{s}(n)$. The 
minimization of the sum of squared errors again leads to a set of linear equations for determining 
the optimum coefficients. Due to the delay D, the LMS algorithm for adjusting the coefficients 
recursively becomes 

$$h_n(k) = h_{n-1}(k) + \Delta\, e(n)\, x(n-k-D), \qquad k = 0, 1, \ldots, N-1, \quad n = 1, 2, \ldots \tag{10-14}$$

Experiment on Suppression of Sinusoidal Interference 

There are three basic software modules required to perform this experiment. 

1. A noise signal generator module that generates a wideband sequence {w(n)} of random 
numbers with zero mean value. In particular, we may generate a sequence of uniformly 
distributed random numbers as previously described in the experiment on system 
identification. The signal power is denoted as $P_w$. 

2. A sinusoidal signal generator module that generates a sine wave sequence 
$s(n) = A \sin \omega_0 n$, where $0 < \omega_0 < \pi$ and A is the signal amplitude. The power of the 
sinusoidal sequence is denoted as $P_s$. 

3. An adaptive FIR filter module where the FIR filter has N tap coefficients that are 
adjusted by the LMS algorithm. The length N of the filter is an input variable to the 
program. 

The three modules are configured as shown in Figure 10-5. In this experiment the delay 
D = 1 is sufficient, since the sequence {w(n)} is a white noise (spectrally flat or 
uncorrelated) sequence. The objective is to adapt the FIR filter coefficients and then to 
investigate the characteristics of the adaptive filter. 

Figure 10-5: Configuration of Modules for Experiment on Interference Suppression 



It is interesting to select the interference signal to be much stronger than the desired 
signal w(n), for example, $P_s = 10 P_w$. Note that the power $P_x$ required in selecting the 
step size parameter in the LMS algorithm is $P_x = P_s + P_w$. The frequency response 
characteristic $H(\omega)$ of the adaptive FIR filter with coefficients {h(k)} should exhibit a resonant 
peak at the frequency of the interference. The frequency response of the interference 
suppression filter is $H_s(\omega) = 1 - H(\omega)$, which should then exhibit a notch at the frequency 
of the interference. 

A flowchart for this experiment is shown in Figure 10-6. It is interesting to display the 
sequences {w(n)}, {s(n)}, and {x(n)}. It is also interesting to display the frequency 
responses $H(\omega)$ and $H_s(\omega)$ after the LMS algorithm has converged. The short-time average 
squared error ASE(m), defined by (10-12), may be used to monitor the convergence 
characteristics of the LMS algorithm. The effect of the length of the adaptive filter on the 
quality of the estimate should be investigated. 

The experiment may be generalized by adding a second sinusoid of a different frequency. 
Then, $H(\omega)$ should exhibit two resonant peaks, provided the frequencies are sufficiently 
separated. Investigate the effect of the filter length N on the resolution of two closely 
spaced sinusoids. 
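
The experiment can be outlined in Python as follows. This is a sketch under stated assumptions rather than a prescribed solution: the interference frequency, filter length, record length, seed, and function names are example choices, the step size is taken as half of 1/(10 N P_x), and the frequency response of the predictor is evaluated with the delay of D samples included so that one minus this response gives the overall suppression filter.

```python
import cmath
import math
import random


def suppress_interference(x, n_taps, delay, step):
    """Delayed adaptive FIR predictor adjusted by LMS.

    Returns the final coefficients and the error sequence e(n).
    """
    h = [0.0] * n_taps
    errors = []
    for n in range(len(x)):
        # taps[k] = x(n - k - D), taken as zero before the start of the data
        taps = [x[n - k - delay] if n - k - delay >= 0 else 0.0
                for k in range(n_taps)]
        estimate = sum(hk * xk for hk, xk in zip(h, taps))
        e = x[n] - estimate
        for k in range(n_taps):
            h[k] += step * e * taps[k]
        errors.append(e)
    return h, errors


def predictor_response(h, delay, omega):
    """Frequency response from x(n) to the estimate, delay included."""
    return sum(hk * cmath.exp(-1j * omega * (k + delay))
               for k, hk in enumerate(h))


rng = random.Random(3)
num_samples, n_taps, delay = 6000, 16, 1
omega0 = 0.3 * math.pi                  # assumed interference frequency
p_w = 1.0 / 3.0                         # power of uniform noise on (-1, 1)
amp = math.sqrt(2.0 * 10.0 * p_w)       # sinusoid power A^2/2 = 10 * P_w
w = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
s = [amp * math.sin(omega0 * n) for n in range(num_samples)]
x = [sn + wn for sn, wn in zip(s, w)]
p_x = 11.0 * p_w                        # P_x = P_s + P_w
step = 0.5 / (10.0 * n_taps * p_x)
h, e = suppress_interference(x, n_taps, delay, step)
```

Evaluating predictor_response(h, delay, omega) over a grid of frequencies between 0 and π displays the resonant peak, and one minus that value displays the notch. Comparing the power of e with the power of x over the last part of the record indicates how much of the sinusoid has been removed.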

10.5 ADAPTIVE LINE ENHANCEMENT 

In the preceding section we described a method for suppressing a strong narrowband interference 
from a wideband signal. An adaptive line enhancer (ALE) has the same configuration as the 
interference suppression filter in Figure 10-4, except that the objective is different. 

Figure 10-6: Flowchart for Experiment on Suppression of Narrowband Interference 

In the adaptive line enhancer, {s(n)} is the desired signal and {w(n)} represents a 
wideband noise component that masks {s(n)}. The desired signal {s(n)} may be a spectral line 
(a pure sinusoid) or a relatively narrowband signal. Usually, the power in the wideband signal 
is greater than that in the narrowband signal, i.e., $P_w > P_s$. It is apparent that the ALE is a 
self-tuning filter that has a peak in its frequency response at the frequency of the input sinusoid 
or in the frequency band occupied by the narrowband signal. By having a narrow bandwidth 
FIR filter, the noise outside the frequency band of the signal is suppressed and, thus, the spectral 
line is enhanced in amplitude relative to the noise power in {w(n)}. 

Experiment on the ALE 

This experiment requires the same software modules as those used in the experiment on 
interference suppression. Hence, the description given in Section 10.4 applies directly. 
One change is that in the ALE, the condition is that P_w > P_s. Secondly, the output signal
from the ALE is the filter output, which is an estimate of {s(n)}. Repeat the experiment
described in the previous section under these conditions.
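
A simulation of the ALE can be sketched in Python before it is coded for the ADSP-2101. Figure 10-4 is not reproduced in this section, so the sketch assumes the usual arrangement in which the adaptive FIR filter operates on the input delayed by D samples and its output is subtracted from x(n) to form the error. The step size 0.1/(N P_x), the line frequency, the filter length, and all function names are assumptions made for illustration.

```python
import math
import random


def ale_lms(x, num_taps, delay, step):
    """Run an LMS line enhancer over x; return (coefficients, estimates, errors)."""
    h = [0.0] * num_taps
    est, err = [], []
    for n in range(len(x)):
        # Delayed input vector x(n - delay - k), k = 0..num_taps-1 (zero before start).
        u = [x[n - delay - k] if n - delay - k >= 0 else 0.0
             for k in range(num_taps)]
        s_hat = sum(hk * uk for hk, uk in zip(h, u))    # ALE output
        e = x[n] - s_hat                                 # error driving the LMS update
        h = [hk + step * e * uk for hk, uk in zip(h, u)]
        est.append(s_hat)
        err.append(e)
    return h, est, err


def simulate_ale(num_samples=4000, num_taps=32, delay=1, seed=1):
    """Return (noise power at the input, error power at the output) after convergence."""
    rng = random.Random(seed)
    p_s, p_w = 0.5, 1.0                  # P_w > P_s, as required for the ALE
    omega0 = 2 * math.pi * 0.1           # hypothetical line frequency
    amp = math.sqrt(2 * p_s)
    s = [amp * math.sin(omega0 * n) for n in range(num_samples)]
    # Approximately Gaussian noise: sum of 12 uniform numbers, shifted to zero mean.
    w = [math.sqrt(p_w) * (sum(rng.random() for _ in range(12)) - 6.0)
         for _ in range(num_samples)]
    x = [sn + wn for sn, wn in zip(s, w)]
    step = 0.1 / (num_taps * (p_s + p_w))    # assumed step size, scaled by N * P_x
    _, est, _ = ale_lms(x, num_taps, delay, step)
    tail = range(num_samples // 2, num_samples)
    mse_in = sum((x[n] - s[n]) ** 2 for n in tail) / len(tail)
    mse_out = sum((est[n] - s[n]) ** 2 for n in tail) / len(tail)
    return mse_in, mse_out


mse_in, mse_out = simulate_ale()
print(f"noise power at input: {mse_in:.3f}   error power at ALE output: {mse_out:.3f}")
```

Comparing the two printed powers indicates how much of the wideband noise is removed from the spectral line; repeating the run with other values of num_taps shows the effect of the filter length.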



10.6 ADAPTIVE CHANNEL EQUALIZATION 

The speed of data transmission over telephone channels is usually limited by channel distortion 
that causes intersymbol interference (ISI). At data rates below 2400 bits per second the ISI is
relatively small and is usually not a problem in the operation of a modem. However, at data rates
above 2400 bits per second, an adaptive equalizer is employed in the modem to compensate for the channel
distortion and, thus, to allow for highly reliable high speed data transmission. In telephone 
channels, filters are used throughout the system to separate signals in different frequency bands. 
These filters cause amplitude and phase distortion. The adaptive equalizer is basically an 
adaptive FIR filter with coefficients that are adjusted by means of the LMS algorithm to correct 
for the channel distortion. 






A block diagram showing the basic elements of a modem transmitting data over a channel
is given in Figure 10-7. Initially, the equalizer coefficients are adjusted by transmitting a short 
training sequence, usually less than 1 second in duration. After the short training period, the 
transmitter begins to transmit the data sequence {a(n)}. To track the possible slow time 
variations in the channel, the equalizer coefficients must continue to be adjusted in an adaptive 
manner while receiving data. This is usually accomplished, as illustrated in Figure 10-7, by 
treating the decisions at the output of the decision device as correct, and using the decisions in 
place of the reference d(n) to generate the error signal. This approach works quite well when 
decision errors occur infrequently, e.g., less than one error in 100 data symbols. The occasional
decision errors cause only a small misadjustment in the equalizer coefficients. 

Experiment on Adaptive Channel Equalization 

The objective of this experiment is to investigate the performance of an adaptive 
equalizer for data transmission over a channel that causes intersymbol interference. The 
basic configuration of the system to be simulated is shown in Figure 10-8. As we observe, 
five basic software modules are required. Note that we have avoided carrier modulation
and demodulation, which are required in a telephone channel modem. This is done in order
to simplify the simulation program. However, all processing involves complex arithmetic
operations. 

Figure 10-7: Application of Adaptive Filtering to Adaptive Channel Equalization

Figure 10-8: Experiment for Investigating the Performance of an Adaptive Equalizer in the Presence of Channel Distortion



The five software modules are as follows: 

1. The data generator module is used to generate a sequence of complex-valued
information symbols a(n). In particular, let us employ four equally probable symbols
s + js, s - js, -s + js, and -s - js, where s is a scale factor that may be set to s = 1, or
it can be a parameter selected by the programmer.

2. The channel filter module is an FIR filter with coefficients {c(n), 0 ≤ n ≤ K - 1} which
simulate the channel distortion. For distortionless transmission, we set c(0) = 1 and
c(n) = 0 for 1 ≤ n ≤ K - 1. The length K of the filter is a parameter that is selected by
the programmer.

3. The noise generator module is used to generate additive noise that is usually present
in any digital communication system. If we are modeling noise that is generated by
electronic devices, the noise distribution should be Gaussian with zero mean. Such
noise can be generated by adding six to twelve uniformly distributed random numbers
and scaling the sum to obtain the desired noise power.

4. The adaptive equalizer module is an FIR filter with tap coefficients
{h(k), 0 ≤ k ≤ N - 1} which are adjusted by the LMS algorithm. However, due to the
use of complex arithmetic, the recursive equation in the LMS algorithm is slightly
modified to

    h_n(k) = h_{n-1}(k) + \Delta\, e(n)\, x^*(n-k)        (10-15)

where the asterisk denotes the complex conjugate.

5. The decision device module takes the estimate d̂(n) and quantizes it to one of
the four possible signal points on the basis of the following decision rule, in which the
thresholds lie at zero, midway between the symbol coordinates +1 and -1:

    Re[d̂(n)] > 0 and Im[d̂(n)] > 0   ->    1 + j

    Re[d̂(n)] > 0 and Im[d̂(n)] < 0   ->    1 - j

    Re[d̂(n)] < 0 and Im[d̂(n)] > 0   ->   -1 + j

    Re[d̂(n)] < 0 and Im[d̂(n)] < 0   ->   -1 - j
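
The data generator, channel filter, noise generator, and decision device described in the list above can be prototyped in a few lines of Python before they are written in ADSP-2101 assembly language. The sketch below is illustrative only: the function names are hypothetical, the noise generator returns a real-valued sequence (a complex noise sequence may be formed from two independent calls), and the decision thresholds are placed at zero, with an estimate falling exactly on a threshold assigned to the negative side as an arbitrary choice.

```python
import math
import random


def data_generator(num_symbols, rng, s=1.0):
    """Draw equally probable symbols from {s+js, s-js, -s+js, -s-js}."""
    return [complex(s if rng.random() < 0.5 else -s,
                    s if rng.random() < 0.5 else -s)
            for _ in range(num_symbols)]


def channel_filter(a, c):
    """FIR channel: y(n) = sum_k c(k) a(n-k), with zero initial state."""
    return [sum(c[k] * a[n - k] for k in range(len(c)) if n - k >= 0)
            for n in range(len(a))]


def noise_generator(num_samples, power, rng, terms=12):
    """Approximately Gaussian, zero-mean noise from sums of uniform numbers."""
    # A sum of `terms` U(0,1) numbers has mean terms/2 and variance terms/12.
    scale = math.sqrt(12.0 * power / terms)
    return [scale * (sum(rng.random() for _ in range(terms)) - terms / 2.0)
            for _ in range(num_samples)]


def decision_device(d_hat):
    """Quantize the estimate to one of the four points +/-1 +/- j (zero thresholds)."""
    return complex(1.0 if d_hat.real > 0 else -1.0,
                   1.0 if d_hat.imag > 0 else -1.0)


rng = random.Random(0)
a = data_generator(8, rng)
y = channel_filter(a, [1.0, 0.2, -0.2])
print([decision_device(v) for v in y])
```

Passing terms=6 to noise_generator uses the lower end of the six-to-twelve range mentioned above; the scale factor adjusts automatically so that the requested noise power is preserved.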

The effectiveness of the equalizer in suppressing the ISI introduced by the channel filter
may be seen by displaying the following relevant sequences in a two-dimensional
(real-imaginary) display. The data generator output {a(n)} should consist of four points with
values ±1 ± j. The effect of channel distortion and additive noise may be viewed by
displaying the sequence {x(n)} at the input to the equalizer. The effectiveness of the
adaptive equalizer may be assessed by displaying its output {d̂(n)} after convergence of
its coefficients. The short-time average squared error ASE(m) may also be used to monitor
the convergence characteristics of the LMS algorithm. Note that a delay must be
introduced into the output of the data generator to compensate for the delays that the signal
encounters due to the channel filter and the adaptive equalizer. For example, this delay
may be set to the largest integer closest to (N + K)/2. Finally, an error counter may be
used to count the number of symbol errors in the received data sequence, and the ratio of
the number of errors to the total number of symbols (the error rate) may be displayed. The
error rate may be varied by changing the level of the ISI and the level of the additive noise.
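
The complete equalization experiment can be sketched in Python as follows. This is an illustrative simulation under several assumptions that are not taken from the text: the step size is set to 0.1/(N P_x), the noise power is split equally between the real and imaginary parts, the delayed data generator output is used as the reference throughout (no decision-directed mode), and symbol errors are counted only over the second half of the run. The function name and default parameter values are hypothetical.

```python
import math
import random


def simulate_equalizer(c, num_taps=11, num_symbols=4000, noise_power=0.01, seed=7):
    """Train a complex LMS equalizer; return (symbol error rate, squared errors)."""
    rng = random.Random(seed)
    K = len(c)
    delay = (num_taps + K + 1) // 2      # largest integer closest to (N + K)/2

    def unit_gauss():
        # Sum of 12 uniform numbers: approximately Gaussian, zero mean, unit variance.
        return sum(rng.random() for _ in range(12)) - 6.0

    a = [complex(1.0 if rng.random() < 0.5 else -1.0,
                 1.0 if rng.random() < 0.5 else -1.0) for _ in range(num_symbols)]
    sigma = math.sqrt(noise_power / 2.0)     # assumed equal split between real and imaginary
    x = [sum(c[k] * a[n - k] for k in range(K) if n - k >= 0)
         + complex(sigma * unit_gauss(), sigma * unit_gauss())
         for n in range(num_symbols)]

    p_x = 2.0 * sum(abs(ck) ** 2 for ck in c) + noise_power   # input power for s = 1
    step = 0.1 / (num_taps * p_x)            # assumed step size, scaled by N * P_x
    h = [0j] * num_taps
    sq_err, errors, counted = [], 0, 0
    for n in range(delay, num_symbols):
        u = [x[n - k] if n - k >= 0 else 0j for k in range(num_taps)]
        d_hat = sum(hk * uk for hk, uk in zip(h, u))          # equalizer output
        d = a[n - delay]                                      # delayed reference symbol
        e = d - d_hat
        # Complex LMS update (10-15): h(k) <- h(k) + step * e(n) * conj(x(n-k)).
        h = [hk + step * e * uk.conjugate() for hk, uk in zip(h, u)]
        sq_err.append(abs(e) ** 2)
        if n >= num_symbols // 2:            # count errors only in the second half
            decision = complex(1.0 if d_hat.real > 0 else -1.0,
                               1.0 if d_hat.imag > 0 else -1.0)
            counted += 1
            errors += decision != d
    return errors / counted, sq_err


rate, sq_err = simulate_equalizer([1.0, 0.2, -0.2])      # mild ISI channel
print(f"symbol error rate after convergence: {rate:.4f}")
```

Running the function with different channel coefficients and noise powers gives the points needed for an error rate curve; the returned squared errors may be averaged in blocks to obtain ASE(m).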

It is suggested that simulations be performed for the following three channel conditions:

(a) No ISI: c(0) = 1, c(n) = 0, 1 ≤ n ≤ K - 1

(b) Mild ISI: c(0) = 1, c(1) = 0.2, c(2) = -0.2, c(n) = 0, 3 ≤ n ≤ K - 1

(c) Strong ISI: c(0) = 1, c(1) = 0.5, c(2) = 0.5, c(n) = 0, 3 ≤ n ≤ K - 1

The measured error rate may be plotted as a function of the signal-to-noise ratio (SNR)
at the input to the equalizer, where SNR is defined as P_s/P_n, where P_s is the signal power,
given as P_s = s^2, and P_n is the noise power of the sequence at the output of the noise
generator.
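
When the error rate is plotted against SNR, it is convenient to fix the SNR and solve for the noise power that the noise generator must produce. The short Python sketch below does this using the definition SNR = P_s/P_n with P_s = s^2; expressing the SNR in decibels and the function name are assumptions made for illustration.

```python
def noise_power_for_snr(snr_db, s=1.0):
    """Return P_n such that P_s / P_n equals the requested SNR, with P_s = s**2."""
    p_s = s ** 2
    return p_s / (10.0 ** (snr_db / 10.0))


# Noise powers for a hypothetical sweep of SNR values, with s = 1.
for snr_db in (0, 5, 10, 15, 20):
    print(f"SNR = {snr_db:2d} dB  ->  P_n = {noise_power_for_snr(snr_db):.4f}")
```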

10.7 ADAPTIVE ECHO CANCELLATION

It is well known that modems are used in the transmission of data over telephone channels. 
Shown in Figure 10-9 is a block diagram of a communication system in which two terminals, 
labeled A and B, transmit data by using modems A and B to interface to a telephone channel. 
As shown, a digital sequence a(n) is transmitted from terminal A to terminal B while a digital
sequence b(n) is transmitted from terminal B to A. This simultaneous transmission in both
directions is called full-duplex transmission. 

Figure 10-9: Full-Duplex Data Transmission over Telephone Channels

When a subscriber leases a private line from a telephone company for the purpose of
transmitting data between terminals A and B, the telephone line provided is a four-wire line,
which is equivalent to having two dedicated telephone (two-wire) channels, one (pair of wires)
for transmitting data in one direction and one (pair of wires) for receiving data from the other
direction. In such a case the two transmission paths are isolated, and consequently, there is no
"crosstalk" or mutual interference between the two signal paths. Channel distortion is
compensated by use of an adaptive equalizer, as described above, at the receiver of each modem.

The major problem with the system shown in Figure 10-9 is the cost of leasing a four-wire
telephone channel. If the volume of traffic is high and the telephone channel is used either
continuously or frequently, as in banking transaction systems or airline reservation systems,
the system pictured in Figure 10-9 is cost-effective. Otherwise, it is not.

An alternative solution for low-volume, infrequent transmission of data is to use the dial-up
switched telephone network. In this case, the local communication link between the subscriber
and the local central telephone office is a two-wire line, called the local loop. At the central
office, the subscriber two-wire line is connected to the main four-wire telephone channels that
interconnect different central offices, called trunk lines, by a device called a hybrid. By using
transformer coupling, the hybrid is tuned to provide isolation between the transmit and receive
channels in full-duplex operation. However, due to impedance mismatch between the hybrid
and the telephone channel, the level of isolation is often insufficient, and consequently, some
of the signal on the transmit side leaks back and corrupts the signal on the receive side, causing
an "echo" that is often heard in voice communications over telephone channels.

To mitigate the echoes in voice transmission, the telephone companies employ a device
called an echo suppressor. In data transmission, the solution is to use an echo canceller within 
each modem. The echo cancellers are implemented as adaptive FIR filters with automatically 
adjustable coefficients. 

With the use of hybrids to couple a two-wire channel to a four-wire channel and echo
cancellers at each modem to estimate and subtract out the echoes, the data communication
system for the dial-up switched network takes the form shown in Figure 10-10. A hybrid is
needed at each modem to isolate the transmitter from the receiver and to couple to the two-wire
local loop. Hybrid A is physically located at the central office of subscriber A while Hybrid B
is located at the central office to which subscriber B is connected. The two central offices are
connected by a four-wire line, with one pair used for transmission from A to B and the other
pair used for transmission in the reverse direction, from B to A. An echo at terminal A due to
hybrid A is called a near-end echo, while an echo at terminal A due to hybrid B is termed
a far-end echo. Both types of echoes are usually present in data transmission and must be
removed by the echo canceller.

Suppose that we neglect the channel distortion for purposes of this discussion, and let us 
deal with echoes only. The signal received at modem A may be expressed as 

    S_{RA}(t) = A_1 S_B(t) + A_2 S_A(t - d_1) + A_3 S_A(t - d_2)        (10-16)

where S_B(t) is the desired signal to be demodulated at modem A, S_A(t - d_1) is the near-end
echo due to hybrid A, S_A(t - d_2) is the far-end echo due to hybrid B, A_i, i = 1, 2, 3, are the
corresponding amplitudes of the three signal components, and {d_1, d_2} are the delays
associated with the echo components. A further disturbance that corrupts the received signal is
additive noise, so that the received signal at modem A is

    r_A(t) = S_{RA}(t) + w(t)        (10-17)

where w(t) represents the additive noise process.

The adaptive echo canceller attempts to estimate adaptively the two echo components.
If its coefficients are h(n), n = 0, 1, ..., M - 1, its output is

    \hat{s}_A(n) = \sum_{k=0}^{M-1} h(k)\, a(n-k)        (10-18)

which is an estimate of the echo signal components. This estimate is subtracted from the sampled
received signal and the resulting error signal can be minimized in the least-squares sense to
adjust optimally the coefficients of the echo canceller.
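
The estimate-and-subtract operation of (10-18) can be written compactly in Python. The sketch below assumes real-valued data and uses the LMS recursion of this chapter for the coefficient adjustment; the function names, the step size, and the two-tap echo path in the example are hypothetical.

```python
def echo_estimate(h, a, n):
    """Echo canceller output (10-18): sum over k of h(k) * a(n - k)."""
    return sum(h[k] * a[n - k] for k in range(len(h)) if n - k >= 0)


def cancel_and_adapt(h, a, r, n, step):
    """Subtract the echo estimate from r(n), then apply one LMS update.

    Returns the error sample e(n) and the updated coefficient list.
    """
    e = r[n] - echo_estimate(h, a, n)
    new_h = [h[k] + (step * e * a[n - k] if n - k >= 0 else 0.0)
             for k in range(len(h))]
    return e, new_h


# Example with a hypothetical two-tap echo path and no desired signal or noise.
true_path = [0.5, -0.25]
a = [1.0, -1.0, 1.0, 1.0]
r = [echo_estimate(true_path, a, n) for n in range(len(a))]
h = [0.0, 0.0]
for n in range(len(a)):
    e, h = cancel_and_adapt(h, a, r, n, step=0.1)
print(h)
```

When the coefficients equal the true echo path, the error is zero and the update leaves them unchanged, which is the condition the adaptation is driving toward.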

Experiment on Echo Cancellation 

The objective of this experiment is to investigate the effectiveness of echo cancellation 
in a data communication system. The basic configuration of the system to be simulated
is shown in Figure 10-11. The following basic software modules are used in the 
experiment. 

Figure 10-10: Block Diagram Model of a Digital Communication System that uses Echo Cancellers in the Modems

Figure 10-11: Block Diagram of System for Investigating the Performance of an Adaptive Echo Canceller

1. Two separate data generator modules are used to generate two binary (±1)
information sequences {a(n)} and {b(n)}. The sequence {b(n)} is the desired sequence
which is detected. The sequence {a(n)} serves as the interfering sequence.

2. Two FIR filter modules are needed. One FIR filter of length M is used to generate
the (single) echo signal which serves as the interference in the detection of the sequence
{b(n)}. The second FIR filter is the adaptive echo canceller which has length N > M,
say N = M + D, where D is the echo delay. The LMS algorithm is used to adapt the
coefficients of the echo canceller.

3. A noise generator module, identical to the one used in the adaptive equalization
experiment, is required.

4. The decision-device module compares the estimate b̂(n) with the threshold zero. If
b̂(n) > 0, the decision b̃(n) = 1 is made. If b̂(n) < 0, the decision b̃(n) = -1 is made.
The decision is compared with the transmitted bit b(n) and an error is counted if
b̃(n) ≠ b(n). The error rate may be displayed for different signal-to-noise ratios.

In practice the echo signal is usually of higher power than the desired signal b(n). For
this reason, the step size parameter Δ in the LMS algorithm may have to be made much
smaller than the value given by (10-9) in order to obtain sufficient suppression of the echo.
One strategy is to begin with the value given by (10-9) and then to reduce this value by
factors of two every one to two hundred iterations. The convergence rate of the LMS
algorithm may be monitored by computing and displaying the ASE(m).

After the coefficients of the echo canceller have converged, it is interesting to compare
its output {ŝ_A(n - D)} with the exact value of the echo signal {s_A(n - D)}. These two
signal sequences may also be displayed for purposes of comparison.
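
The echo cancellation experiment, including the step size reduction strategy, can be sketched in Python as follows. Several choices here are assumptions made for illustration and are not taken from the text: the echo path coefficients and delay, the noise power, the starting step size 0.1/N standing in for the value given by (10-9), the lower limit placed on the step size so that adaptation never stops, and the use of the error signal e(n) as the estimate b̂(n) presented to the decision device (with a value of exactly zero decided as +1).

```python
import math
import random


def simulate_echo_canceller(echo_path, echo_delay, num_samples=4000,
                            noise_power=0.01, halve_every=200, seed=3):
    """Return (bit error rate, residual echo power, canceller coefficients)."""
    rng = random.Random(seed)
    M = len(echo_path)
    N = M + echo_delay                       # canceller length N = M + D
    a = [1.0 if rng.random() < 0.5 else -1.0 for _ in range(num_samples)]
    b = [1.0 if rng.random() < 0.5 else -1.0 for _ in range(num_samples)]
    step0 = 0.1 / N                          # assumed starting step; a(n) has unit power
    step = step0
    h = [0.0] * N
    bit_errors, counted, resid = 0, 0, []
    for n in range(num_samples):
        if n > 0 and n % halve_every == 0:
            step = max(step / 2.0, step0 / 64.0)     # halve, down to an assumed floor
        # Echo: a(n) delayed by D samples and filtered by the echo path.
        echo = sum(g * a[n - echo_delay - k] for k, g in enumerate(echo_path)
                   if n - echo_delay - k >= 0)
        w = math.sqrt(noise_power) * (sum(rng.random() for _ in range(12)) - 6.0)
        r = b[n] + echo + w                          # received sample
        u = [a[n - k] if n - k >= 0 else 0.0 for k in range(N)]
        echo_hat = sum(hk * uk for hk, uk in zip(h, u))
        e = r - echo_hat                             # also used as the estimate of b(n)
        h = [hk + step * e * uk for hk, uk in zip(h, u)]
        if n >= num_samples // 2:                    # measure after convergence
            decision = 1.0 if e >= 0 else -1.0
            counted += 1
            bit_errors += decision != b[n]
            resid.append((echo - echo_hat) ** 2)
    return bit_errors / counted, sum(resid) / len(resid), h


rate, resid_power, h = simulate_echo_canceller([2.0, -1.0, 0.5], echo_delay=2)
print(f"bit error rate: {rate:.4f}   residual echo power: {resid_power:.5f}")
print("canceller coefficients:", [round(v, 3) for v in h])
```

After convergence, the first D coefficients should be close to zero and the remaining M coefficients close to the echo path, so that the canceller output tracks the delayed echo signal.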



In this chapter we introduced the reader to the theory and implementation of adaptive FIR filters 
with applications to system identification, interference suppression, narrowband frequency 
enhancement, adaptive equalization, and echo cancellation. Experiments were formulated 
involving these applications of adaptive filtering which may be implemented on the ADSP-2100
family of microcomputers. 